1. 11 January 2020 (1 commit)
  2. 10 January 2020 (2 commits)
    • libbpf: Make bpf_map order and indices stable · 492ab020
      Andrii Nakryiko authored
      Currently, libbpf re-sorts bpf_map structs after all the maps are added and
      initialized, which might change their relative order and invalidate any
      bpf_map pointer or index taken before that. This is inconvenient and
      error-prone. For instance, it can cause the .kconfig map index to point to
      the wrong map.
      
      Furthermore, libbpf itself doesn't rely on any specific ordering of bpf_maps,
      so the sorting is just an unnecessary complication right now. This patch drops
      the sorting of maps and makes their relative positions fixed. If an efficient
      index is ever needed, it's better to have a separate array of pointers as a
      search index, instead of reordering bpf_map structs in place. That will be
      less error-prone and will allow multiple independent orderings, if necessary
      (e.g., either by section index or by name).
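      
      If such an index is ever added, a minimal sketch (illustrative only, not
      libbpf code) could keep the backing array untouched and sort a separate
      array of pointers:
      
      #include <stdlib.h>
      #include <string.h>
      
      /* bpf_map structs stay in insertion order; a separately sorted array of
       * pointers provides the search index, so pointers and indices into `maps`
       * remain stable. */
      struct bpf_map { const char *name; /* ... */ };
      
      static int cmp_map_names(const void *a, const void *b)
      {
              const struct bpf_map *x = *(const struct bpf_map **)a;
              const struct bpf_map *y = *(const struct bpf_map **)b;
      
              return strcmp(x->name, y->name);
      }
      
      static struct bpf_map **build_name_index(struct bpf_map *maps, size_t n)
      {
              struct bpf_map **idx = calloc(n, sizeof(*idx));
      
              if (!idx)
                      return NULL;
              for (size_t i = 0; i < n; i++)
                      idx[i] = &maps[i];
              qsort(idx, n, sizeof(*idx), cmp_map_names);
              return idx;
      }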
      
      Fixes: 166750bc ("libbpf: Support libbpf-provided extern variables")
      Reported-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200110034247.1220142-1-andriin@fb.com
    • bpf: libbpf: Add STRUCT_OPS support · 590a0088
      Martin KaFai Lau authored
      This patch adds BPF STRUCT_OPS support to libbpf.
      
      The only sec_name convention is SEC(".struct_ops"), which identifies the
      struct_ops implemented in BPF. For example, to implement a
      tcp_congestion_ops:
      
      SEC(".struct_ops")
      struct tcp_congestion_ops dctcp = {
      	.init           = (void *)dctcp_init,  /* <-- a bpf_prog */
      	/* ... some more func ptrs ... */
      	.name           = "bpf_dctcp",
      };
      
      Each struct_ops is defined as a global variable under SEC(".struct_ops")
      as above.  libbpf creates a map for each variable, and the variable name
      is the map's name.  Multiple struct_ops are supported under
      SEC(".struct_ops").
      
      In the bpf_object__open phase, libbpf will look for the SEC(".struct_ops")
      section and determine which btf-type each struct_ops is implementing.
      Note that the btf-type here refers to a type in the bpf_prog.o's btf.
      A "struct bpf_map" is added by bpf_object__add_map(), as with other maps.
      libbpf then collects (through SHT_REL) the bpf progs that the func ptrs
      refer to.  No btf_vmlinux is needed in the open phase.
      
      In the bpf_object__load phase, the map fields that depend on
      btf_vmlinux are initialized (in bpf_map__init_kern_struct_ops()).
      libbpf also sets prog->type, prog->attach_btf_id, and
      prog->expected_attach_type there.  Thus, the prog's properties do
      not rely on its section name.
      [ Currently, the bpf_prog's btf-type ==> btf_vmlinux's btf-type matching
        process is as simple as: member-name match + btf-kind match + size match.
        If these matching conditions fail, libbpf will reject the struct_ops.
        The currently supported target is "struct tcp_congestion_ops", most of
        whose members are function pointers.
        The member ordering of the bpf_prog's btf-type can be different from
        the btf_vmlinux's btf-type. ]
      
      Then, all obj->maps are created as usual (in bpf_object__create_maps()).
      
      Once the maps are created and the progs' properties are all set,
      libbpf will proceed to load all the progs.
      
      bpf_map__attach_struct_ops() is added to register a struct_ops
      map with a kernel subsystem.
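      
      A minimal user-space sketch of that flow might look like the following
      (the object file name and the "dctcp" map name are assumptions taken
      from the example above, and error handling is abbreviated):
      
      #include <bpf/libbpf.h>
      
      static int register_dctcp(void)
      {
              struct bpf_object *obj;
              struct bpf_map *map;
              struct bpf_link *link;
      
              obj = bpf_object__open_file("bpf_dctcp.o", NULL);  /* open phase */
              if (libbpf_get_error(obj))
                      return -1;
              if (bpf_object__load(obj))                         /* load phase */
                      goto err;
              map = bpf_object__find_map_by_name(obj, "dctcp");  /* map named after the variable */
              if (!map)
                      goto err;
              link = bpf_map__attach_struct_ops(map);            /* register with the kernel */
              if (libbpf_get_error(link))
                      goto err;
              return 0;  /* keep obj and link alive while the struct_ops stays attached */
      err:
              bpf_object__close(obj);
              return -1;
      }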
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200109003514.3856730-1-kafai@fb.com
  3. 26 December 2019 (1 commit)
    • libbpf: Support CO-RE relocations for LDX/ST/STX instructions · 8ab9da57
      Andrii Nakryiko authored
      Clang patch [0] enables emitting relocatable generic ALU/ALU64 instructions
      (i.e., shifts and arithmetic operations), as well as generic load/store
      instructions. The former are already supported by libbpf as is. This patch
      adds support for the load/store instructions. The relocatable field offset
      is encoded in the BPF instruction's 16-bit offset field and is adjusted by
      libbpf based on the target kernel BTF.
      
      These Clang changes and the corresponding libbpf changes allow for more
      succinct generated BPF code by encoding relocatable field reads as a single
      ST/LDX/STX instruction. They also enable relocatable access to the BPF
      context. Previously, if a context struct (e.g., __sk_buff) was accessed with
      CO-RE relocations (e.g., due to the preserve_access_index attribute), it
      would be rejected by the BPF verifier due to a modified context pointer
      dereference. With the Clang patch, such context accesses are both relocatable
      and have a fixed offset from the point of view of the BPF verifier.
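      
      Conceptually, the libbpf side boils down to patching the instruction's off
      field instead of imm. A minimal sketch (illustrative only; the function name
      is made up, and new_off is assumed to be the field offset already computed
      against the target kernel BTF):
      
      #include <linux/bpf.h>
      #include <limits.h>
      
      /* For BPF_LDX/BPF_ST/BPF_STX, the relocated field offset lives in the
       * instruction's signed 16-bit off field, not in imm. */
      static int patch_ldst_offset(struct bpf_insn *insn, __u32 new_off)
      {
              if (new_off > SHRT_MAX)
                      return -1;      /* must fit the 16-bit offset field */
              insn->off = (__s16)new_off;
              return 0;
      }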
      
        [0] https://reviews.llvm.org/D71790
      
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Yonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/bpf/20191223180305.86417-1-andriin@fb.com
  4. 19 December 2019 (6 commits)
  5. 18 December 2019 (1 commit)
  6. 17 December 2019 (1 commit)
  7. 16 December 2019 (11 commits)
  8. 14 December 2019 (3 commits)
  9. 13 December 2019 (1 commit)
  10. 28 November 2019 (2 commits)
  11. 26 November 2019 (1 commit)
  12. 25 November 2019 (4 commits)
  13. 20 November 2019 (1 commit)
    • libbpf: Fix call relocation offset calculation bug · a0d7da26
      Andrii Nakryiko authored
      When relocating a subprogram call, libbpf doesn't take into account
      relo->text_off, which comes from the symbol's value. This generally works
      fine for subprograms implemented as static functions, but breaks for global
      functions.
      
      Taking a simplified test_pkt_access.c as an example:
      
      __attribute__ ((noinline))
      static int test_pkt_access_subprog1(volatile struct __sk_buff *skb)
      {
              return skb->len * 2;
      }
      
      __attribute__ ((noinline))
      static int test_pkt_access_subprog2(int val, volatile struct __sk_buff *skb)
      {
              return skb->len + val;
      }
      
      SEC("classifier/test_pkt_access")
      int test_pkt_access(struct __sk_buff *skb)
      {
              if (test_pkt_access_subprog1(skb) != skb->len * 2)
                      return TC_ACT_SHOT;
              if (test_pkt_access_subprog2(2, skb) != skb->len + 2)
                      return TC_ACT_SHOT;
              return TC_ACT_UNSPEC;
      }
      
      When compiled, we get two relocations, both pointing to the '.text' symbol,
      whose st_value is 0 (the beginning of the .text section):
      
      0000000000000008  000000050000000a R_BPF_64_32            0000000000000000 .text
      0000000000000040  000000050000000a R_BPF_64_32            0000000000000000 .text
      
      The test_pkt_access_subprog1 and test_pkt_access_subprog2 offsets (targets
      of the two calls) are encoded within the call instructions' imm32 parts as
      -1 and 2, respectively:
      
      0000000000000000 test_pkt_access_subprog1:
             0:       61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0)
             1:       64 00 00 00 01 00 00 00 w0 <<= 1
             2:       95 00 00 00 00 00 00 00 exit
      
      0000000000000018 test_pkt_access_subprog2:
             3:       61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0)
             4:       04 00 00 00 02 00 00 00 w0 += 2
             5:       95 00 00 00 00 00 00 00 exit
      
      0000000000000000 test_pkt_access:
             0:       bf 16 00 00 00 00 00 00 r6 = r1
      ===>   1:       85 10 00 00 ff ff ff ff call -1
             2:       bc 01 00 00 00 00 00 00 w1 = w0
             3:       b4 00 00 00 02 00 00 00 w0 = 2
             4:       61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0)
             5:       64 02 00 00 01 00 00 00 w2 <<= 1
             6:       5e 21 08 00 00 00 00 00 if w1 != w2 goto +8 <LBB0_3>
             7:       bf 61 00 00 00 00 00 00 r1 = r6
      ===>   8:       85 10 00 00 02 00 00 00 call 2
             9:       bc 01 00 00 00 00 00 00 w1 = w0
            10:       61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0)
            11:       04 02 00 00 02 00 00 00 w2 += 2
            12:       b4 00 00 00 ff ff ff ff w0 = -1
            13:       1e 21 01 00 00 00 00 00 if w1 == w2 goto +1 <LBB0_3>
            14:       b4 00 00 00 02 00 00 00 w0 = 2
      0000000000000078 LBB0_3:
            15:       95 00 00 00 00 00 00 00 exit
      
      Now, if we compile the example with global functions, the setup changes.
      The relocations are now against the test_pkt_access_subprog1 and
      test_pkt_access_subprog2 symbols themselves, with test_pkt_access_subprog2
      pointing 24 bytes into its respective section (.text), i.e., 3 instructions
      in:
      
      0000000000000008  000000070000000a R_BPF_64_32            0000000000000000 test_pkt_access_subprog1
      0000000000000048  000000080000000a R_BPF_64_32            0000000000000018 test_pkt_access_subprog2
      
      Call instructions now encode offsets relative to the function symbols and
      are both set to -1:
      
      0000000000000000 test_pkt_access_subprog1:
             0:       61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0)
             1:       64 00 00 00 01 00 00 00 w0 <<= 1
             2:       95 00 00 00 00 00 00 00 exit
      
      0000000000000018 test_pkt_access_subprog2:
             3:       61 20 00 00 00 00 00 00 r0 = *(u32 *)(r2 + 0)
             4:       0c 10 00 00 00 00 00 00 w0 += w1
             5:       95 00 00 00 00 00 00 00 exit
      
      0000000000000000 test_pkt_access:
             0:       bf 16 00 00 00 00 00 00 r6 = r1
      ===>   1:       85 10 00 00 ff ff ff ff call -1
             2:       bc 01 00 00 00 00 00 00 w1 = w0
             3:       b4 00 00 00 02 00 00 00 w0 = 2
             4:       61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0)
             5:       64 02 00 00 01 00 00 00 w2 <<= 1
             6:       5e 21 09 00 00 00 00 00 if w1 != w2 goto +9 <LBB2_3>
             7:       b4 01 00 00 02 00 00 00 w1 = 2
             8:       bf 62 00 00 00 00 00 00 r2 = r6
      ===>   9:       85 10 00 00 ff ff ff ff call -1
            10:       bc 01 00 00 00 00 00 00 w1 = w0
            11:       61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0)
            12:       04 02 00 00 02 00 00 00 w2 += 2
            13:       b4 00 00 00 ff ff ff ff w0 = -1
            14:       1e 21 01 00 00 00 00 00 if w1 == w2 goto +1 <LBB2_3>
            15:       b4 00 00 00 02 00 00 00 w0 = 2
      0000000000000080 LBB2_3:
            16:       95 00 00 00 00 00 00 00 exit
      
      Thus, the right formula to calculate the target call offset after relocation
      takes into account the relocation's target symbol value (its offset within
      the section), the call instruction's imm32 offset, and (subtracting, to get
      a relative instruction offset) the instruction index of the call instruction
      itself. All of that is shifted by the number of instructions in the main
      program, given that all sub-programs are copied over after the main program.
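      
      In (illustrative) code form, with made-up names and all quantities in
      8-byte instruction units, the corrected math might read:
      
      /* Hedged sketch of the corrected relocation math; the names are
       * illustrative, not libbpf's actual variables. */
      static int relocated_call_imm(int orig_imm,           /* imm32 from the call insn */
                                    int sym_insn_off,       /* symbol offset within .text,
                                                               in instructions */
                                    int call_insn_idx,      /* index of the call insn */
                                    int main_prog_insn_cnt) /* size of the main program */
      {
              /* sub-programs are appended after the main program, so shift by
               * main_prog_insn_cnt; subtracting the call's own index keeps the
               * resulting imm relative to the call instruction */
              return main_prog_insn_cnt + sym_insn_off + orig_imm - call_insn_idx;
      }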
      
      Convert a few selftests relying on bpf-to-bpf calls to use global functions
      instead of static ones.
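      
      For instance, for the example above the conversion is just dropping
      `static` (a sketch; the include is assumed from the selftest context):
      
      #include <linux/bpf.h>
      
      /* Before (static subprog): relocations were recorded against '.text'.
       * After (global subprog): the relocation targets the function symbol itself. */
      __attribute__ ((noinline))
      int test_pkt_access_subprog1(volatile struct __sk_buff *skb)
      {
              return skb->len * 2;
      }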
      
      Fixes: 48cca7e4 ("libbpf: add support for bpf_call")
      Reported-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20191119224447.3781271-1-andriin@fb.com
  14. 18 November 2019 (1 commit)
  15. 16 November 2019 (2 commits)
  16. 11 November 2019 (2 commits)