1. 26 May, 2021 1 commit
    • xdp: Extend xdp_redirect_map with broadcast support · e624d4ed
      Committed by Hangbin Liu
      This patch adds two flags BPF_F_BROADCAST and BPF_F_EXCLUDE_INGRESS to
      extend xdp_redirect_map for broadcast support.
      
      With BPF_F_BROADCAST the packet will be broadcast to all the interfaces
      in the map. With BPF_F_EXCLUDE_INGRESS the ingress interface will be
      excluded when broadcasting.
      
      When getting the devices in a dev hash map via dev_map_hash_get_next_key(),
      there is a possibility that we fall back to the first key when a device
      is removed. This would duplicate packets on some interfaces. So just walk
      all the buckets to avoid this issue. For a dev array map, we also walk the
      whole map to find valid interfaces.
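
A minimal userspace sketch of the array-map walk described above, not the kernel implementation. The `SKETCH_` names, the function `sketch_broadcast_targets()`, and the convention that ifindex 0 marks an empty slot are assumptions made for illustration; the flag values are local copies so the block compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for BPF_F_BROADCAST and BPF_F_EXCLUDE_INGRESS. */
#define SKETCH_F_BROADCAST       (1ULL << 3)
#define SKETCH_F_EXCLUDE_INGRESS (1ULL << 4)

/*
 * Walk every slot of an array-style device map and collect the ifindexes
 * that would receive a copy of the packet. Returns the number of targets
 * written to out[], which must have room for max_entries items.
 * Assumption of this sketch: ifindex 0 means "slot is empty".
 */
static size_t sketch_broadcast_targets(const int *map, size_t max_entries,
                                       int ingress_ifindex, uint64_t flags,
                                       int *out)
{
    size_t n = 0;

    assert(map != NULL && out != NULL);

    if (!(flags & SKETCH_F_BROADCAST))
        return 0; /* not a broadcast request */

    for (size_t i = 0; i < max_entries; i++) {
        int ifindex = map[i];

        if (ifindex == 0)
            continue; /* skip holes instead of stopping the walk */
        if ((flags & SKETCH_F_EXCLUDE_INGRESS) && ifindex == ingress_ifindex)
            continue; /* do not send the packet back where it came from */
        out[n++] = ifindex;
    }
    return n;
}
```

Because the walk visits each slot exactly once, a device removed during the walk cannot cause a restart from the first entry, which is the duplication problem the commit message describes for the hash map.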
      
      Function bpf_clear_redirect_map() was removed in
      commit ee75aef2 ("bpf, xdp: Restructure redirect actions").
      Add it back as we need to use ri->map again.
      
      With test topology:
        +-------------------+             +-------------------+
        | Host A (i40e 10G) |  ---------- | eno1(i40e 10G)    |
        +-------------------+             |                   |
                                          |   Host B          |
        +-------------------+             |                   |
        | Host C (i40e 10G) |  ---------- | eno2(i40e 10G)    |
        +-------------------+             |                   |
                                          |          +------+ |
                                          | veth0 -- | Peer | |
                                          | veth1 -- |      | |
                                          | veth2 -- |  NS  | |
                                          |          +------+ |
                                          +-------------------+
      
      On Host A:
       # pktgen/pktgen_sample03_burst_single_flow.sh -i eno1 -d $dst_ip -m $dst_mac -s 64
      
      On Host B(Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz, 128G Memory):
      Use xdp_redirect_map and xdp_redirect_map_multi in samples/bpf for testing.
      All the veth peers in the NS have an XDP_DROP program loaded. The
      forward_map max_entries in xdp_redirect_map_multi is modified to 4.
      
      Testing the performance impact on the regular xdp_redirect path with and
      without patch (to check impact of additional check for broadcast mode):
      
      5.12 rc4         | redirect_map        i40e->i40e      |    2.0M |  9.7M
      5.12 rc4         | redirect_map        i40e->veth      |    1.7M | 11.8M
      5.12 rc4 + patch | redirect_map        i40e->i40e      |    2.0M |  9.6M
      5.12 rc4 + patch | redirect_map        i40e->veth      |    1.7M | 11.7M
      
      Testing the performance when cloning packets with the redirect_map_multi
      test, using a redirect map size of 4, filled with 1-3 devices:
      
      5.12 rc4 + patch | redirect_map multi  i40e->veth (x1) |    1.7M | 11.4M
      5.12 rc4 + patch | redirect_map multi  i40e->veth (x2) |    1.1M |  4.3M
      5.12 rc4 + patch | redirect_map multi  i40e->veth (x3) |    0.8M |  2.6M
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Link: https://lore.kernel.org/bpf/20210519090747.1655268-3-liuhangbin@gmail.com
  2. 25 May, 2021 1 commit
  3. 19 May, 2021 5 commits
  4. 20 Apr, 2021 1 commit
  5. 14 Apr, 2021 1 commit
  6. 13 Apr, 2021 1 commit
  7. 12 Apr, 2021 1 commit
  8. 02 Apr, 2021 1 commit
  9. 27 Mar, 2021 1 commit
    • bpf: Support bpf program calling kernel function · e6ac2450
      Committed by Martin KaFai Lau
      This patch adds support to the BPF verifier to allow a bpf program to
      call a kernel function directly.
      
      The use case included in this set is to allow bpf-tcp-cc to directly
      call some tcp-cc helper functions (e.g. "tcp_cong_avoid_ai()").  Those
      functions have already been used by some kernel tcp-cc implementations.
      
      This set will also allow the bpf-tcp-cc program to directly call the
      kernel tcp-cc implementation.  For example, a bpf_dctcp may only want to
      implement its own dctcp_cwnd_event() and reuse other dctcp_*() directly
      from the kernel tcp_dctcp.c instead of reimplementing (or
      copy-and-pasting) them.
      
      The tcp-cc kernel functions mentioned above will be whitelisted
      for the struct_ops bpf-tcp-cc programs to use in a later patch.
      The whitelisted functions are not bound to a fixed ABI contract.
      Those functions have already been used by the existing kernel tcp-cc.
      If any of them changes, both in-tree and out-of-tree kernel tcp-cc
      implementations have to be changed.  The same goes for the struct_ops
      bpf-tcp-cc programs which have to be adjusted accordingly.
      
      This patch is to make the required changes in the bpf verifier.
      
      The first change is in btf.c: it adds a case in "btf_check_func_arg_match()".
      When the passed in "btf->kernel_btf == true", it means matching the
      verifier regs' states with a kernel function.  This will handle the
      PTR_TO_BTF_ID reg.  It also maps PTR_TO_SOCK_COMMON, PTR_TO_SOCKET,
      and PTR_TO_TCP_SOCK to their kernel btf_id.
      
      In the later libbpf patch, the insn calling a kernel function will
      look like:
      
      insn->code == (BPF_JMP | BPF_CALL)
      insn->src_reg == BPF_PSEUDO_KFUNC_CALL /* <- new in this patch */
      insn->imm == func_btf_id /* btf_id of the running kernel */
      
      [ For the future calling function-in-kernel-module support, an array
        of module btf_fds can be passed at the load time and insn->off
        can be used to index into this array. ]
      
      At an early stage of the verifier, the verifier will collect all kernel
      function calls into "struct bpf_kfunc_desc".  Those
      descriptors are stored in "prog->aux->kfunc_tab" and will
      be available to the JIT.  Since this "add" operation is similar
      to the current "add_subprog()" and looking for the same insn->code,
      they are done together in the new "add_subprog_and_kfunc()".
      
      In the "do_check()" stage, the new "check_kfunc_call()" is added
      to verify the kernel function call instruction:
      1. Ensure the kernel function can be used by a particular BPF_PROG_TYPE.
         A new bpf_verifier_ops "check_kfunc_call" is added to do that.
         The bpf-tcp-cc struct_ops program will implement this function in
         a later patch.
      2. Call "btf_check_kfunc_args_match()" to ensure the regs can be
         used as the args of a kernel function.
      3. Mark the regs' type, subreg_def, and zext_dst.
      
      At the later do_misc_fixups() stage, the new fixup_kfunc_call()
      will replace the insn->imm with the function address (relative
      to __bpf_call_base).  If needed, the jit can find the btf_func_model
      by calling the new bpf_jit_find_kfunc_model(prog, insn).
      With the imm set to the function address, "bpftool prog dump xlated"
      will be able to display the kernel function calls the same way as
      it displays other bpf helper calls.
      
      A gpl_compatible program is required to call a kernel function.
      
      This feature currently requires JIT.
      
      The verifier selftests are adjusted because of the changes in
      the verbose log in add_subprog_and_kfunc().
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20210325015142.1544736-1-kafai@fb.com
  10. 09 Mar, 2021 1 commit
  11. 05 Mar, 2021 11 commits
  12. 27 Feb, 2021 2 commits
  13. 25 Feb, 2021 1 commit
  14. 13 Feb, 2021 2 commits
    • bpf: Add BPF-helper for MTU checking · 34b2021c
      Committed by Jesper Dangaard Brouer
      This BPF-helper bpf_check_mtu() works for both XDP and TC-BPF programs.
      
      The SKB object is complex and the skb->len value (accessible from the
      BPF-prog) also includes the length of any extra GRO/GSO segments, but
      without taking into account that these GRO/GSO segments get
      transport (L4) and network (L3) headers added before being transmitted.
      Thus, this BPF-helper is created such that the BPF-programmer doesn't
      need to handle these details in the BPF-prog.
      
      The API is designed to help the BPF-programmer who wants to do packet
      context size changes, which involve other helpers. These other helpers
      usually do a delta size adjustment. This helper also supports a delta
      size (len_diff), which allows the BPF-programmer to reuse arguments needed
      by these other helpers, and perform the MTU check prior to doing any
      actual size adjustment of the packet context.
      
      It is on purpose that we allow the len adjustment to become a negative
      result, which will pass the MTU check. This might seem weird, but it's not
      this helper's responsibility to "catch" wrong len_diff adjustments. Other
      helpers will take care of these checks, if the BPF-programmer chooses to
      do an actual size adjustment.
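
A simplified model of the len_diff check described above, assuming a non-GSO frame whose length includes the link-layer header. It is a sketch, not the helper's implementation: the enum, the function name, and the parameter names are hypothetical, and the comparison against mtu + hard_header_len follows the V9 and V6 changelog notes (hard_header_len is used, and the MTU is kept at L3 level).

```c
#include <assert.h>

/* Hypothetical result codes for this sketch. */
enum sketch_mtu_ret {
    SKETCH_MTU_OK          = 0,
    SKETCH_MTU_FRAG_NEEDED = 1,
};

/*
 * Check whether a frame of frame_len bytes (link header included) would
 * still fit after a planned size change of len_diff bytes.
 * mtu is the L3 MTU; hard_header_len is the link-layer header size.
 */
static enum sketch_mtu_ret sketch_check_mtu(int frame_len, int len_diff,
                                            int mtu, int hard_header_len)
{
    int dev_len = mtu + hard_header_len; /* largest frame the device takes */
    int new_len = frame_len + len_diff;  /* may go negative on purpose */

    assert(mtu >= 0 && hard_header_len >= 0);

    /* A negative new_len is not rejected here: it is below dev_len. */
    return new_len > dev_len ? SKETCH_MTU_FRAG_NEEDED : SKETCH_MTU_OK;
}
```

With an MTU of 1500 and a 14-byte Ethernet header, a 1514-byte frame passes with len_diff 0 and fails with len_diff 20, while an oversized negative len_diff still passes, matching the behaviour the commit message calls intentional.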
      
      V14:
       - Improve man-page desc of len_diff.
      
      V13:
       - Enforce flag BPF_MTU_CHK_SEGS cannot use len_diff.
      
      V12:
       - Simplify segment check that calls skb_gso_validate_network_len.
       - Helpers should return long
      
      V9:
      - Use dev->hard_header_len (instead of ETH_HLEN)
      - Annotate with unlikely req from Daniel
      - Fix logic error using skb_gso_validate_network_len from Daniel
      
      V6:
      - Took John's advice and dropped BPF_MTU_CHK_RELAX
      - Returned MTU is kept at L3-level (like fib_lookup)
      
      V4: Lot of changes
       - ifindex 0 now uses current netdev for MTU lookup
       - rename helper from bpf_mtu_check to bpf_check_mtu
       - fix bug for GSO pkt length (as skb->len is total len)
       - remove __bpf_len_adj_positive, simply allow negative len adj
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/161287790461.790810.3429728639563297353.stgit@firesoul
    • bpf: bpf_fib_lookup return MTU value as output when looked up · e1850ea9
      Committed by Jesper Dangaard Brouer
      The BPF-helpers for FIB lookup (bpf_xdp_fib_lookup and bpf_skb_fib_lookup)
      can perform an MTU check and return BPF_FIB_LKUP_RET_FRAG_NEEDED. The
      BPF-prog does not know the MTU value that caused this rejection.
      
      If the BPF-prog wants to implement PMTU (Path MTU Discovery) (rfc1191) it
      needs to know this MTU value for the ICMP packet.
      
      This patch changes the lookup and result struct bpf_fib_lookup to contain
      this MTU value as output via a union with 'tot_len', as this is the value
      used for the MTU lookup.
      
      V5:
       - Fixed uninit value spotted by Dan Carpenter.
       - Name struct output member mtu_result
      Reported-by: kernel test robot <lkp@intel.com>
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/161287789952.790810.13134700381067698781.stgit@firesoul
  15. 12 Feb, 2021 2 commits
  16. 11 Feb, 2021 1 commit
  17. 15 Jan, 2021 3 commits
  18. 13 Jan, 2021 1 commit
  19. 12 Dec, 2020 1 commit
  20. 05 Dec, 2020 1 commit
  21. 04 Dec, 2020 1 commit