1. 26 Jul, 2016 2 commits
    • samples/bpf: Add test/example of using bpf_probe_write_user bpf helper · cf9b1199
      Committed by Sargun Dhillon
      This example shows using a kprobe to act as a DNAT mechanism to divert
      traffic for arbitrary endpoints. It rewrites the arguments to a syscall
      while they're still in userspace, and before the syscall has a chance
      to copy the argument into kernel space.
      
      Although this is an example, it also acts as a test because the mapped
      address is 255.255.255.255:555 -> real address, and that's not a legal
      address to connect to. If the helper is broken, the example will fail
      on the intermediate steps, as well as the final step to verify the
      rewrite of userspace memory succeeded.
      Signed-off-by: Sargun Dhillon <sargun@sargun.me>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
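The rewrite described in the sample commit above can be pictured in plain userspace C. This is only a sketch of the idea, not the sample's code: `struct dnat_rule`, `make_addr`, `dnat_rewrite`, and the target address 127.0.0.1:5555 are hypothetical names and values chosen for illustration. In the real sample the match and write-back happen inside a kprobe, with bpf_probe_write_user patching the sockaddr while it is still in user memory.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical rule: connections to `from` are diverted to `to`. */
struct dnat_rule {
    struct sockaddr_in from;
    struct sockaddr_in to;
};

/* Build an IPv4 socket address from dotted-quad text and a host-order port. */
static struct sockaddr_in make_addr(const char *ip, unsigned short port)
{
    struct sockaddr_in sa;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    inet_pton(AF_INET, ip, &sa.sin_addr);
    return sa;
}

/*
 * Rewrite *dst in place if it matches a rule. Returns 1 on rewrite and 0
 * if no rule matched. This stands in for the probe that patches the
 * connect() argument before the kernel copies it.
 */
static int dnat_rewrite(struct sockaddr_in *dst,
                        const struct dnat_rule *rules, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (dst->sin_family == AF_INET &&
            dst->sin_addr.s_addr == rules[i].from.sin_addr.s_addr &&
            dst->sin_port == rules[i].from.sin_port) {
            *dst = rules[i].to;
            return 1;
        }
    }
    return 0;
}
```

Because 255.255.255.255:555 is not a legal address to connect to, a connect() that succeeds after such a rewrite is evidence that the rewrite took place, which is the property the commit message relies on for its test.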
    • bpf: Add bpf_probe_write_user BPF helper to be called in tracers · 96ae5227
      Committed by Sargun Dhillon
      This allows user memory to be written to during the course of a kprobe.
      It shouldn't be used to implement any kind of security mechanism
      because of TOC-TOU attacks, but rather to debug, divert, and
      manipulate execution of semi-cooperative processes.
      
      Although it uses probe_kernel_write, we limit the address space
      the probe can write into by checking the space with access_ok.
      We do this as opposed to calling copy_to_user directly, in order
      to avoid sleeping. In addition we ensure the thread's current fs
      / segment is USER_DS and the thread is neither exiting nor a kernel thread.
      
      Given that this feature is meant for experiments and risks
      crashing the system and running programs, we print a warning
      when a proglet that attempts to use this helper is installed,
      along with the pid and process name.
      Signed-off-by: Sargun Dhillon <sargun@sargun.me>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 21 Jul, 2016 1 commit
  3. 20 Jul, 2016 2 commits
    • bpf: add sample for xdp forwarding and rewrite · 764cbcce
      Committed by Brenden Blanco
      Add a sample that rewrites and forwards packets out on the same
      interface. Observed single core forwarding performance of ~10Mpps.
      
      Since the mlx4 driver under test recycles every single packet page, the
      perf output shows almost exclusively just the ring management and bpf
      program work. Slowdowns are likely occurring due to cache misses.
      Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Add sample for adding simple drop program to link · 86af8b41
      Committed by Brenden Blanco
      Add a sample program that only drops packets at the BPF_PROG_TYPE_XDP_RX
      hook of a link. With the drop-only program, observed single core rate is
      ~20Mpps.
      
      Other tests were run, for instance without the dropcnt increment or
      without reading from the packet header; the packet rate was mostly
      unchanged.
      
      $ perf record -a samples/bpf/xdp1 $(</sys/class/net/eth0/ifindex)
      proto 17:   20403027 drops/s
      
      ./pktgen_sample03_burst_single_flow.sh -i $DEV -d $IP -m $MAC -t 4
      Running... ctrl^C to stop
      Device: eth4@0
      Result: OK: 11791017(c11788327+d2689) usec, 59622913 (60byte,0frags)
        5056638pps 2427Mb/sec (2427186240bps) errors: 0
      Device: eth4@1
      Result: OK: 11791012(c11787906+d3106) usec, 60526944 (60byte,0frags)
        5133311pps 2463Mb/sec (2463989280bps) errors: 0
      Device: eth4@2
      Result: OK: 11791019(c11788249+d2769) usec, 59868091 (60byte,0frags)
        5077431pps 2437Mb/sec (2437166880bps) errors: 0
      Device: eth4@3
      Result: OK: 11795039(c11792403+d2636) usec, 59483181 (60byte,0frags)
        5043067pps 2420Mb/sec (2420672160bps) errors: 0
      
      perf report --no-children:
       26.05%  ksoftirqd/0  [mlx4_en]         [k] mlx4_en_process_rx_cq
       17.84%  ksoftirqd/0  [mlx4_en]         [k] mlx4_en_alloc_frags
        5.52%  ksoftirqd/0  [mlx4_en]         [k] mlx4_en_free_frag
        4.90%  swapper      [kernel.vmlinux]  [k] poll_idle
        4.14%  ksoftirqd/0  [kernel.vmlinux]  [k] get_page_from_freelist
        2.78%  ksoftirqd/0  [kernel.vmlinux]  [k] __free_pages_ok
        2.57%  ksoftirqd/0  [kernel.vmlinux]  [k] bpf_map_lookup_elem
        2.51%  swapper      [mlx4_en]         [k] mlx4_en_process_rx_cq
        1.94%  ksoftirqd/0  [kernel.vmlinux]  [k] percpu_array_map_lookup_elem
        1.45%  swapper      [mlx4_en]         [k] mlx4_en_alloc_frags
        1.35%  ksoftirqd/0  [kernel.vmlinux]  [k] free_one_page
        1.33%  swapper      [kernel.vmlinux]  [k] intel_idle
        1.04%  ksoftirqd/0  [mlx4_en]         [k] 0x000000000001c5c5
        0.96%  ksoftirqd/0  [mlx4_en]         [k] 0x000000000001c58d
        0.93%  ksoftirqd/0  [mlx4_en]         [k] 0x000000000001c6ee
        0.92%  ksoftirqd/0  [mlx4_en]         [k] 0x000000000001c6b9
        0.89%  ksoftirqd/0  [kernel.vmlinux]  [k] __alloc_pages_nodemask
        0.83%  ksoftirqd/0  [mlx4_en]         [k] 0x000000000001c686
        0.83%  ksoftirqd/0  [mlx4_en]         [k] 0x000000000001c5d5
        0.78%  ksoftirqd/0  [mlx4_en]         [k] mlx4_alloc_pages.isra.23
        0.77%  ksoftirqd/0  [mlx4_en]         [k] 0x000000000001c5b4
        0.77%  ksoftirqd/0  [kernel.vmlinux]  [k] net_rx_action
      
      machine specs:
       receiver - Intel E5-1630 v3 @ 3.70GHz
       sender - Intel E5645 @ 2.40GHz
       Mellanox ConnectX-3 @40G
      Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
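The pktgen `Result:` lines quoted in the commit above can be cross-checked with a little integer arithmetic. The sketch below recomputes the rate of the first thread from its packet count and elapsed microseconds, and sums the four reported per-thread rates. The helper names (`pps_from`, `bps_from`, `total_pps`) are made up for this illustration, and whether pktgen computes its figures in exactly this way is an assumption; it may work at a finer time resolution, so other rows could differ by a packet or so.

```c
#include <stdint.h>

/* Packets per second from a packet count and an elapsed time in microseconds. */
static uint64_t pps_from(uint64_t packets, uint64_t usec)
{
    return packets * 1000000ULL / usec;
}

/* Bits per second for fixed-size packets (size given in bytes). */
static uint64_t bps_from(uint64_t pps, uint64_t pkt_bytes)
{
    return pps * pkt_bytes * 8ULL;
}

/* Sum per-thread rates to get the aggregate offered load. */
static uint64_t total_pps(const uint64_t *pps, int n)
{
    uint64_t sum = 0;

    for (int i = 0; i < n; i++)
        sum += pps[i];
    return sum;
}
```

For eth4@0, `pps_from(59622913, 11791017)` gives 5056638 and `bps_from(5056638, 60)` gives 2427186240, matching the quoted line. Summing the four reported rates gives 20310447 pps, which is in line with the ~20Mpps drop rate stated in the commit message.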
  4. 02 Jul, 2016 1 commit
    • cgroup: bpf: Add an example to do cgroup checking in BPF · a3f74617
      Committed by Martin KaFai Lau
      test_cgrp2_array_pin.c:
      A userland program that creates a bpf_map (BPF_MAP_TYPE_CGROUP_ARRAY),
      populates/updates it with a cgroup2-backed fd and pins it to a
      bpf-fs file.  The pinned file can be loaded by tc and then used
      by the bpf prog later.  This program can also update an existing pinned
      array, which could be useful for debugging/testing purposes.
      
      test_cgrp2_tc_kern.c:
      A bpf prog which should be loaded by tc.  It demonstrates
      the usage of bpf_skb_in_cgroup.
      
      test_cgrp2_tc.sh:
      A script that glues test_cgrp2_array_pin.c and
      test_cgrp2_tc_kern.c together.  The idea is:
      1. Load the test_cgrp2_tc_kern.o by tc
      2. Use test_cgrp2_array_pin.c to populate a BPF_MAP_TYPE_CGROUP_ARRAY
         with a cgroup fd
      3. Do a 'ping -6 ff02::1%ve' to ensure the packet has been
         dropped because of a match on the cgroup
      
      Most of the lines in test_cgrp2_tc.sh are boilerplate
      to set up the cgroup/bpf-fs/net-devices/netns, etc.  It is
      not bulletproof on errors but should work well enough and
      give enough debug info if things do not go well.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Tejun Heo <tj@kernel.org>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 26 Jun, 2016 1 commit
  6. 07 May, 2016 2 commits
  7. 30 Apr, 2016 5 commits
  8. 29 Apr, 2016 1 commit
  9. 15 Apr, 2016 2 commits
    • bpf, samples: add test cases for raw stack · 3f2050e2
      Committed by Daniel Borkmann
      This adds test cases mostly around ARG_PTR_TO_RAW_STACK to check the
      verifier behaviour.
      
        [...]
        #84 raw_stack: no skb_load_bytes OK
        #85 raw_stack: skb_load_bytes, no init OK
        #86 raw_stack: skb_load_bytes, init OK
        #87 raw_stack: skb_load_bytes, spilled regs around bounds OK
        #88 raw_stack: skb_load_bytes, spilled regs corruption OK
        #89 raw_stack: skb_load_bytes, spilled regs corruption 2 OK
        #90 raw_stack: skb_load_bytes, spilled regs + data OK
        #91 raw_stack: skb_load_bytes, invalid access 1 OK
        #92 raw_stack: skb_load_bytes, invalid access 2 OK
        #93 raw_stack: skb_load_bytes, invalid access 3 OK
        #94 raw_stack: skb_load_bytes, invalid access 4 OK
        #95 raw_stack: skb_load_bytes, invalid access 5 OK
        #96 raw_stack: skb_load_bytes, invalid access 6 OK
        #97 raw_stack: skb_load_bytes, large access OK
        Summary: 98 PASSED, 0 FAILED
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf, samples: don't zero data when not needed · 02413cab
      Committed by Daniel Borkmann
      Remove the zero initialization in the sample programs where appropriate.
      Note that this is an optimization which is now possible; old programs
      still doing the zero initialization are just fine as well. Also, make
      sure we don't have padding issues when we don't memset() the entire
      struct anymore.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 08 Apr, 2016 3 commits
  11. 07 Apr, 2016 3 commits
    • samples/bpf: Enable powerpc support · 138d6153
      Committed by Naveen N. Rao
      Add the necessary definitions for building bpf samples on ppc.
      
      Since ppc doesn't store the function return address on the stack, modify how
      PT_REGS_RET() and PT_REGS_FP() work.
      
      Also, introduce PT_REGS_IP() to access the instruction pointer.
      
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • samples/bpf: Use llc in PATH, rather than a hardcoded value · 128d1514
      Committed by Naveen N. Rao
      While at it, remove the generation of .s files and fix some typos in the
      related comment.
      
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • samples/bpf: Fix build breakage with map_perf_test_user.c · 77e63534
      Committed by Naveen N. Rao
      Building BPF samples is failing with the below error:
      
      samples/bpf/map_perf_test_user.c: In function ‘main’:
      samples/bpf/map_perf_test_user.c:134:9: error: variable ‘r’ has
      initializer but incomplete type
        struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
               ^
      samples/bpf/map_perf_test_user.c:134:21: error: ‘RLIM_INFINITY’
      undeclared (first use in this function)
        struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
                           ^
      samples/bpf/map_perf_test_user.c:134:21: note: each undeclared
      identifier is reported only once for each function it appears in
      samples/bpf/map_perf_test_user.c:134:9: warning: excess elements in
      struct initializer [enabled by default]
        struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
               ^
      samples/bpf/map_perf_test_user.c:134:9: warning: (near initialization
      for ‘r’) [enabled by default]
      samples/bpf/map_perf_test_user.c:134:9: warning: excess elements in
      struct initializer [enabled by default]
      samples/bpf/map_perf_test_user.c:134:9: warning: (near initialization
      for ‘r’) [enabled by default]
      samples/bpf/map_perf_test_user.c:134:16: error: storage size of ‘r’
      isn’t known
        struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
                      ^
      samples/bpf/map_perf_test_user.c:139:2: warning: implicit declaration of
      function ‘setrlimit’ [-Wimplicit-function-declaration]
        setrlimit(RLIMIT_MEMLOCK, &r);
        ^
      samples/bpf/map_perf_test_user.c:139:12: error: ‘RLIMIT_MEMLOCK’
      undeclared (first use in this function)
        setrlimit(RLIMIT_MEMLOCK, &r);
                  ^
      samples/bpf/map_perf_test_user.c:134:16: warning: unused variable ‘r’
      [-Wunused-variable]
        struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
                      ^
      make[2]: *** [samples/bpf/map_perf_test_user.o] Error 1
      
      Fix this by including the necessary header file.
      
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
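The errors quoted in the commit above (`struct rlimit` incomplete, `RLIM_INFINITY` and `RLIMIT_MEMLOCK` undeclared, implicit `setrlimit`) are what a compiler reports when the resource-limit declarations are missing. The commit message does not name the header it added; on POSIX systems these symbols are declared in `<sys/resource.h>`, so the following sketch assumes that is the one meant. The wrapper name `raise_memlock_limit` is hypothetical.

```c
#include <stdio.h>
#include <sys/resource.h>  /* struct rlimit, setrlimit(), RLIMIT_MEMLOCK, RLIM_INFINITY */

/*
 * Lift the locked-memory limit, as the BPF samples do before creating maps.
 * Returns 0 on success and -1 on failure (for example, when run without
 * sufficient privilege).
 */
static int raise_memlock_limit(void)
{
    struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};

    if (setrlimit(RLIMIT_MEMLOCK, &r) != 0) {
        perror("setrlimit(RLIMIT_MEMLOCK)");
        return -1;
    }
    return 0;
}
```

Checking the return value is an addition made here for illustration; the line quoted in the build log calls `setrlimit()` without inspecting the result.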
  12. 09 Mar, 2016 7 commits
  13. 20 Feb, 2016 1 commit
    • samples/bpf: offwaketime example · a6ffe7b9
      Committed by Alexei Starovoitov
      This is a simplified version of Brendan Gregg's offwaketime.
      This program shows kernel stack traces and task names that were blocked and
      "off-CPU", along with the stack traces and task names for the threads that woke
      them, and the total elapsed time from when they blocked to when they were woken
      up. The combined stacks, task names, and total time are summarized in kernel
      context for efficiency.
      
      Example:
      $ sudo ./offwaketime | flamegraph.pl > demo.svg
      Open demo.svg in a browser as a FlameGraph visualization.
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 06 Feb, 2016 3 commits
  15. 17 Nov, 2015 1 commit
  16. 03 Nov, 2015 1 commit
    • bpf: add sample usages for persistent maps/progs · 42984d7c
      Committed by Daniel Borkmann
      This patch adds a couple of stand-alone examples of how the BPF_OBJ_PIN
      and BPF_OBJ_GET commands can be used.
      
      Example with maps:
      
        # ./fds_example -F /sys/fs/bpf/m -P -m -k 1 -v 42
        bpf: map fd:3 (Success)
        bpf: pin ret:(0,Success)
        bpf: fd:3 u->(1:42) ret:(0,Success)
        # ./fds_example -F /sys/fs/bpf/m -G -m -k 1
        bpf: get fd:3 (Success)
        bpf: fd:3 l->(1):42 ret:(0,Success)
        # ./fds_example -F /sys/fs/bpf/m -G -m -k 1 -v 24
        bpf: get fd:3 (Success)
        bpf: fd:3 u->(1:24) ret:(0,Success)
        # ./fds_example -F /sys/fs/bpf/m -G -m -k 1
        bpf: get fd:3 (Success)
        bpf: fd:3 l->(1):24 ret:(0,Success)
      
        # ./fds_example -F /sys/fs/bpf/m2 -P -m
        bpf: map fd:3 (Success)
        bpf: pin ret:(0,Success)
        # ./fds_example -F /sys/fs/bpf/m2 -G -m -k 1
        bpf: get fd:3 (Success)
        bpf: fd:3 l->(1):0 ret:(0,Success)
        # ./fds_example -F /sys/fs/bpf/m2 -G -m
        bpf: get fd:3 (Success)
      
      Example with progs:
      
        # ./fds_example -F /sys/fs/bpf/p -P -p
        bpf: prog fd:3 (Success)
        bpf: pin ret:(0,Success)
        bpf sock:4 <- fd:3 attached ret:(0,Success)
        # ./fds_example -F /sys/fs/bpf/p -G -p
        bpf: get fd:3 (Success)
        bpf: sock:4 <- fd:3 attached ret:(0,Success)
      
        # ./fds_example -F /sys/fs/bpf/p2 -P -p -o ./sockex1_kern.o
        bpf: prog fd:5 (Success)
        bpf: pin ret:(0,Success)
        bpf: sock:3 <- fd:5 attached ret:(0,Success)
        # ./fds_example -F /sys/fs/bpf/p2 -G -p
        bpf: get fd:3 (Success)
        bpf: sock:4 <- fd:3 attached ret:(0,Success)
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
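The two commands exercised by fds_example above can be issued directly through the bpf(2) system call. The following is a minimal sketch, assuming Linux UAPI headers that provide `<linux/bpf.h>` with the `pathname` and `bpf_fd` members of `union bpf_attr`. The wrapper names `bpf_obj_pin_fd` and `bpf_obj_get_fd` are hypothetical and are not taken from the sample.

```c
#define _GNU_SOURCE
#include <linux/bpf.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Pin an existing BPF map or program fd at `path` on a mounted bpf fs. */
static int bpf_obj_pin_fd(int fd, const char *path)
{
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.pathname = (uint64_t)(uintptr_t)path;
    attr.bpf_fd = (uint32_t)fd;
    return (int)syscall(__NR_bpf, BPF_OBJ_PIN, &attr, sizeof(attr));
}

/* Obtain a new fd for the object pinned at `path`; -1 with errno set on failure. */
static int bpf_obj_get_fd(const char *path)
{
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.pathname = (uint64_t)(uintptr_t)path;
    return (int)syscall(__NR_bpf, BPF_OBJ_GET, &attr, sizeof(attr));
}
```

In terms of the transcript above, the `-P` runs correspond to a pin of a freshly created object, and the `-G` runs to a get on the same path from a later process. Both wrappers return -1 when the path is not on a bpf fs or the caller lacks permission.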
  17. 28 Oct, 2015 1 commit
  18. 22 Oct, 2015 1 commit
  19. 13 Oct, 2015 1 commit
    • bpf: add unprivileged bpf tests · bf508877
      Committed by Alexei Starovoitov
      Add new tests to samples/bpf/test_verifier:
      
      unpriv: return pointer
        checks that pointer cannot be returned from the eBPF program
      
      unpriv: add const to pointer
      unpriv: add pointer to pointer
      unpriv: neg pointer
        checks that pointer arithmetic is disallowed
      
      unpriv: cmp pointer with const
      unpriv: cmp pointer with pointer
        checks that comparison of pointers is disallowed
        Only one case is allowed: 'void *value = bpf_map_lookup_elem(..); if (value == 0) ...'
      
      unpriv: check that printk is disallowed
        since bpf_trace_printk is not available to unprivileged users
      
      unpriv: pass pointer to helper function
        checks that pointers cannot be passed to functions that expect integers
        If function expects a pointer the verifier allows only that type of pointer.
        Like 1st argument of bpf_map_lookup_elem() must be pointer to map.
        (applies to non-root as well)
      
      unpriv: indirectly pass pointer on stack to helper function
        checks that pointer stored into stack cannot be used as part of key
        passed into bpf_map_lookup_elem()
      
      unpriv: mangle pointer on stack 1
      unpriv: mangle pointer on stack 2
        checks that writing into stack slot that already contains a pointer
        is disallowed
      
      unpriv: read pointer from stack in small chunks
        checks that < 8 byte read from stack slot that contains a pointer is
        disallowed
      
      unpriv: write pointer into ctx
        checks that storing pointers into skb->fields is disallowed
      
      unpriv: write pointer into map elem value
        checks that storing pointers into element values is disallowed
        For example:
        int bpf_prog(struct __sk_buff *skb)
        {
          u32 key = 0;
          u64 *value = bpf_map_lookup_elem(&map, &key);
          if (value)
             *value = (u64) skb;
        }
        will be rejected.
      
      unpriv: partial copy of pointer
        checks that doing 32-bit register mov from register containing
        a pointer is disallowed
      
      unpriv: pass pointer to tail_call
        checks that passing pointer as an index into bpf_tail_call
        is disallowed
      
      unpriv: cmp map pointer with zero
        checks that comparing map pointer with constant is disallowed
      
      unpriv: write into frame pointer
        checks that frame pointer is read-only (applies to root too)
      
      unpriv: cmp of frame pointer
        checks that R10 cannot be used in a comparison
      
      unpriv: cmp of stack pointer
        checks that Rx = R10 - imm is ok, but comparing Rx is not
      
      unpriv: obfuscate stack pointer
        checks that Rx = R10 - imm is ok, but Rx -= imm is not
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
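Several of the test descriptions above share one theme: an unprivileged program must not be able to leak a kernel pointer. The toy model below restates three of those rules as plain C predicates. It is a simplified illustration and not how the kernel verifier is implemented: the `val_kind` enum, the function names, and the assumption that a privileged loader is exempt from these three checks are all made up for this sketch.

```c
#include <stdbool.h>

/* Simplified value kinds; the real verifier distinguishes many more. */
enum val_kind {
    VAL_SCALAR,            /* plain number */
    VAL_PTR_TO_CTX,        /* e.g. the skb argument */
    VAL_PTR_TO_STACK,      /* frame pointer minus an offset */
    VAL_MAP_VALUE_OR_NULL  /* result of bpf_map_lookup_elem() */
};

static bool is_pointer(enum val_kind k)
{
    return k != VAL_SCALAR;
}

/* "unpriv: return pointer": a pointer must not become the return value. */
static bool may_return(enum val_kind k, bool privileged)
{
    return privileged || !is_pointer(k);
}

/* "unpriv: write pointer into map elem value". */
static bool may_store_in_map_value(enum val_kind k, bool privileged)
{
    return privileged || !is_pointer(k);
}

/*
 * "unpriv: cmp pointer with const": the only comparison allowed is the
 * NULL check on a map lookup result.
 */
static bool may_compare_with_const(enum val_kind k, long imm, bool privileged)
{
    if (privileged || !is_pointer(k))
        return true;
    return k == VAL_MAP_VALUE_OR_NULL && imm == 0;
}
```

Under this model the `*value = (u64) skb;` example from the commit message is rejected because `may_store_in_map_value(VAL_PTR_TO_CTX, false)` is false, while the preceding `if (value)` check passes because it compares a map lookup result with zero.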
  20. 18 Sep, 2015 1 commit
    • bpf: add bpf_redirect() helper · 27b29f63
      Committed by Alexei Starovoitov
      The existing bpf_clone_redirect() helper clones the skb before redirecting
      it to RX or TX of the destination netdev.
      Introduce a bpf_redirect() helper that does the same without cloning.
      
      Benchmarked with two hosts using 10G ixgbe NICs.
      One host is doing line rate pktgen.
      Another host is configured as:
      $ tc qdisc add dev $dev ingress
      $ tc filter add dev $dev root pref 10 u32 match u32 0 0 flowid 1:2 \
         action bpf run object-file tcbpf1_kern.o section clone_redirect_xmit drop
      so it receives the packet on $dev and immediately xmits it on $dev + 1
      The section 'clone_redirect_xmit' in tcbpf1_kern.o file has the program
      that does bpf_clone_redirect() and performance is 2.0 Mpps
      
      $ tc filter add dev $dev root pref 10 u32 match u32 0 0 flowid 1:2 \
         action bpf run object-file tcbpf1_kern.o section redirect_xmit drop
      which is using bpf_redirect() - 2.4 Mpps
      
      and using cls_bpf with integrated actions as:
      $ tc filter add dev $dev root pref 10 \
        bpf run object-file tcbpf1_kern.o section redirect_xmit integ_act classid 1
      performance is 2.5 Mpps
      
      To summarize:
      u32+act_bpf using clone_redirect - 2.0 Mpps
      u32+act_bpf using redirect - 2.4 Mpps
      cls_bpf using redirect - 2.5 Mpps
      
      For comparison linux bridge in this setup is doing 2.1 Mpps
      and ixgbe rx + drop in ip_rcv - 7.8 Mpps
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>