提交 · 81f522f96f788ba978f505d5a03a039f75360b8b · openeuler / Kernel

16 7月, 2019 1 次提交

samples/bpf: build with -D__TARGET_ARCH_$(SRCARCH) · 81f522f9

由 Ilya Leoshkevich 提交于 7月 15, 2019

While $ARCH can be relatively flexible (see Makefile and
tools/scripts/Makefile.arch), $SRCARCH always corresponds to a directory
name under arch/.

Therefore, build samples with -D__TARGET_ARCH_$(SRCARCH), since that
matches the expectations of bpf_helpers.h.
Signed-off-by: NIlya Leoshkevich <iii@linux.ibm.com>
Acked-by: NVasily Gorbik <gor@linux.ibm.com>
Acked-by: NAndrii Nakryiko <andriin@fb.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

81f522f9

03 7月, 2019 3 次提交

samples/bpf: fix tcp_bpf.readme detach command · d78e3f06

由 Stanislav Fomichev 提交于 7月 02, 2019

Copy-paste, should be detach, not attach.
Signed-off-by: NStanislav Fomichev <sdf@google.com>
Acked-by: NSoheil Hassas Yeganeh <soheil@google.com>
Acked-by: NYuchung Cheng <ycheng@google.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

d78e3f06

samples/bpf: add sample program that periodically dumps TCP stats · 39533884

由 Stanislav Fomichev 提交于 7月 02, 2019

Uses new RTT callback to dump stats every second.

$ mkdir -p /tmp/cgroupv2
$ mount -t cgroup2 none /tmp/cgroupv2
$ mkdir -p /tmp/cgroupv2/foo
$ echo $$ >> /tmp/cgroupv2/foo/cgroup.procs
$ bpftool prog load ./tcp_dumpstats_kern.o /sys/fs/bpf/tcp_prog
$ bpftool cgroup attach /tmp/cgroupv2/foo sock_ops pinned /sys/fs/bpf/tcp_prog
$ bpftool prog tracelog
$ # run neper/netperf/etc

Used neper to compare performance with and without this program attached
and didn't see any noticeable performance impact.

Sample output:
  <idle>-0     [015] ..s.  2074.128800: 0: dsack_dups=0 delivered=242526
  <idle>-0     [015] ..s.  2074.128808: 0: delivered_ce=0 icsk_retransmits=0
  <idle>-0     [015] ..s.  2075.130133: 0: dsack_dups=0 delivered=323599
  <idle>-0     [015] ..s.  2075.130138: 0: delivered_ce=0 icsk_retransmits=0
  <idle>-0     [005] .Ns.  2076.131440: 0: dsack_dups=0 delivered=404648
  <idle>-0     [005] .Ns.  2076.131447: 0: delivered_ce=0 icsk_retransmits=0

Cc: Eric Dumazet <edumazet@google.com>
Cc: Priyaranjan Jha <priyarjha@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: NSoheil Hassas Yeganeh <soheil@google.com>
Acked-by: NYuchung Cheng <ycheng@google.com>
Signed-off-by: NStanislav Fomichev <sdf@google.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

39533884

bpf: Add support for fq's EDT to HBM · 71634d7f

由 brakmo 提交于 7月 02, 2019

Adds support for fq's Earliest Departure Time to HBM (Host Bandwidth
Manager). Includes a new BPF program supporting EDT, and also updates
corresponding programs.

It will drop packets with an EDT of more than 500us in the future
unless the packet belongs to a flow with less than 2 packets in flight.
This is done so each flow has at least 2 packets in flight, so they
will not starve, and also to help prevent delayed ACK timeouts.

It will also work with ECN enabled traffic, where the packets will be
CE marked if their EDT is more than 50us in the future.

The table below shows some performance numbers. The flows are back to
back RPCS. One server sending to another, either 2 or 4 flows.
One flow is a 10KB RPC, the rest are 1MB RPCs. When there are more
than one flow of a given RPC size, the numbers represent averages.

The rate limit applies to all flows (they are in the same cgroup).
Tests ending with "-edt" ran with the new BPF program supporting EDT.
Tests ending with "-hbt" ran on top HBT qdisc with the specified rate
(i.e. no HBM). The other tests ran with the HBM BPF program included
in the HBM patch-set.

EDT has limited value when using DCTCP, but it helps in many cases when
using Cubic. It usually achieves larger link utilization and lower
99% latencies for the 1MB RPCs.
HBM ends up queueing a lot of packets with its default parameter values,
reducing the goodput of the 10KB RPCs and increasing their latency. Also,
the RTTs seen by the flows are quite large.

Aggr 10K 10K 10K 1MB 1MB 1MB
Limit rate drops RTT rate P90 P99 rate P90 P99
Test rate Flows Mbps % us Mbps us us Mbps ms ms
-------- ---- ----- ---- ----- --- ---- ---- ---- ---- ---- ----
cubic 1G 2 904 0.02 108 257 511 539 647 13.4 24.5
cubic-edt 1G 2 982 0.01 156 239 656 967 743 14.0 17.2
dctcp 1G 2 977 0.00 105 324 408 744 653 14.5 15.9
dctcp-edt 1G 2 981 0.01 142 321 417 811 660 15.7 17.0
cubic-htb 1G 2 919 0.00 1825 40 2822 4140 879 9.7 9.9

cubic 200M 2 155 0.30 220 81 532 655 74 283 450
cubic-edt 200M 2 188 0.02 222 87 1035 1095 101 84 85
dctcp 200M 2 188 0.03 111 77 912 939 111 76 325
dctcp-edt 200M 2 188 0.03 217 74 1416 1738 114 76 79
cubic-htb 200M 2 188 0.00 5015 8 14ms 15ms 180 48 50

cubic 1G 4 952 0.03 110 165 516 546 262 38 154
cubic-edt 1G 4 973 0.01 190 111 1034 1314 287 65 79
dctcp 1G 4 951 0.00 103 180 617 905 257 37 38
dctcp-edt 1G 4 967 0.00 163 151 732 1126 272 43 55
cubic-htb 1G 4 914 0.00 3249 13 7ms 8ms 300 29 34

cubic 5G 4 4236 0.00 134 305 490 624 1310 10 17
cubic-edt 5G 4 4865 0.00 156 306 425 759 1520 10 16
dctcp 5G 4 4936 0.00 128 485 221 409 1484 7 9
dctcp-edt 5G 4 4924 0.00 148 390 392 623 1508 11 26

v1 -> v2: Incorporated Andrii's suggestions
v2 -> v3: Incorporated Yonghong's suggestions
v3 -> v4: Removed credit update that is not needed
Signed-off-by: NLawrence Brakmo <brakmo@fb.com>
Acked-by: NYonghong Song <yhs@fb.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

71634d7f

28 6月, 2019 1 次提交

xsk: Change the default frame size to 4096 and allow controlling it · 123e8da1

由 Maxim Mikityanskiy 提交于 6月 26, 2019

The typical XDP memory scheme is one packet per page. Change the AF_XDP
frame size in libbpf to 4096, which is the page size on x86, to allow
libbpf to be used with the drivers with the packet-per-page scheme.

Add a command line option -f to xdpsock to allow to specify a custom
frame size.
Signed-off-by: NMaxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: NTariq Toukan <tariqt@mellanox.com>
Acked-by: NSaeed Mahameed <saeedm@mellanox.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

123e8da1

26 6月, 2019 1 次提交

samples: bpf: make the use of xdp samples consistent · 9e859e8f

由 Daniel T. Lee 提交于 6月 25, 2019

Currently, each xdp samples are inconsistent in the use.
Most of the samples fetch the interface with it's name.
(ex. xdp1, xdp2skb, xdp_redirect_cpu, xdp_sample_pkts, etc.)

But some of the xdp samples are fetching the interface with
ifindex by command argument.

This commit enables xdp samples to fetch interface with it's name
without changing the original index interface fetching.
(<ifname|ifindex> fetching in the same way as xdp_sample_pkts_user.c does.)
Signed-off-by: NDaniel T. Lee <danieltimlee@gmail.com>
Acked-by: NToke Høiland-Jørgensen <toke@redhat.com>
Acked-by: NJesper Dangaard Brouer <brouer@redhat.com>
Acked-by: NSong Liu <songliubraving@fb.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

9e859e8f

25 6月, 2019 1 次提交

samples: bpf: Remove bpf_debug macro in favor of bpf_printk · 7ae9f281

由 Michal Rostecki 提交于 6月 18, 2019

ibumad example was implementing the bpf_debug macro which is exactly the
same as the bpf_printk macro available in bpf_helpers.h. This change
makes use of bpf_printk instead of bpf_debug.
Signed-off-by: NMichal Rostecki <mrostecki@opensuse.org>
Acked-by: NAndrii Nakryiko <andriin@fb.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>

7ae9f281

24 6月, 2019 1 次提交

samples/bpf: xdp_redirect, correctly get dummy program id · 20f6239d

由 Prashant Bhole 提交于 6月 20, 2019

When we terminate xdp_redirect, it ends up with following message:
"Program on iface OUT changed, not removing"
This results in dummy prog still attached to OUT interface.
It is because signal handler checks if the programs are the same that
we had attached. But while fetching dummy_prog_id, current code uses
prog_fd instead of dummy_prog_fd. This patch passes the correct fd.

Fixes: 3b7a8ec2 ("samples/bpf: Check the prog id before exiting")
Signed-off-by: NPrashant Bhole <prashantbhole.linux@gmail.com>
Acked-by: NToke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

20f6239d

19 6月, 2019 1 次提交

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 505 · 7f904d7e

由 Thomas Gleixner 提交于 6月 04, 2019

Based on 1 normalized pattern(s):

  gplv2

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 58 file(s).
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NEnrico Weigelt <info@metux.net>
Reviewed-by: NAllison Randal <allison@lohutok.net>
Reviewed-by: NKate Stewart <kstewart@linuxfoundation.org>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081207.556988620@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

7f904d7e

18 6月, 2019 2 次提交

samples: bpf: refactor header include path · 4d18f6de

由 Daniel T. Lee 提交于 6月 16, 2019

Currently, header inclusion in each file is inconsistent.
For example, "libbpf.h" header is included as multiple ways.

    #include "bpf/libbpf.h"
    #include "libbpf.h"

Due to commit b552d33c ("samples/bpf: fix include path
in Makefile"), $(srctree)/tools/lib/bpf/ path had been included
during build, path "bpf/" in header isn't necessary anymore.

This commit removes path "bpf/" in header inclusion.
Signed-off-by: NDaniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

4d18f6de

samples: bpf: remove unnecessary include options in Makefile · fa206dcc

由 Daniel T. Lee 提交于 6月 16, 2019

Due to recent change of include path at commit b552d33c
("samples/bpf: fix include path in Makefile"), some of the
previous include options became unnecessary.

This commit removes duplicated include options in Makefile.
Signed-off-by: NDaniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

fa206dcc

15 6月, 2019 1 次提交

samples/bpf: fix include path in Makefile · b552d33c

由 Prashant Bhole 提交于 6月 14, 2019

Recent commit included libbpf.h in selftests/bpf/bpf_util.h.
Since some samples use bpf_util.h and samples/bpf/Makefile doesn't
have libbpf.h path included, build was failing. Let's add the path
in samples/bpf/Makefile.
Signed-off-by: NPrashant Bhole <prashantbhole.linux@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

b552d33c

11 6月, 2019 1 次提交

samples: bpf: don't run probes at the local make stage · 0ed3cc4a

由 Jakub Kicinski 提交于 6月 07, 2019

Quentin reports that commit 07c3bbdb ("samples: bpf: print
a warning about headers_install") is producing the false
positive when make is invoked locally, from the samples/bpf/
directory.

When make is run locally it hits the "all" target, which
will recursively invoke make through the full build system.

Speed up the "local" run which doesn't actually build anything,
and avoid false positives by skipping all the probes if not in
kbuild environment (cover both the new warning and the BTF
probes).
Reported-by: NQuentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: NQuentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>

0ed3cc4a

06 6月, 2019 1 次提交

samples: bpf: print a warning about headers_install · 07c3bbdb

由 Jakub Kicinski 提交于 6月 05, 2019

It seems like periodically someone posts patches to "fix"
header includes.  The issue is that samples expect the
include path to have the uAPI headers (from usr/) first,
and then tools/ headers, so that locally installed uAPI
headers take precedence.  This means that if users didn't
run headers_install they will see all sort of strange
compilation errors, e.g.:

  HOSTCC  samples/bpf/test_lru_dist
  samples/bpf/test_lru_dist.c:39:8: error: redefinition of ‘struct list_head’
   struct list_head {
          ^~~~~~~~~
   In file included from samples/bpf/test_lru_dist.c:9:0:
   ../tools/include/linux/types.h:69:8: note: originally defined here
    struct list_head {
           ^~~~~~~~~

Try to detect this situation, and print a helpful warning.

v2: just use HOSTCC (Jiong).
Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: NQuentin Monnet <quentin.monnet@netronome.com>
Acked-by: NMartin KaFai Lau <kafai@fb.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

07c3bbdb

05 6月, 2019 1 次提交

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 295 · 5b497af4

由 Thomas Gleixner 提交于 5月 29, 2019

Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of version 2 of the gnu general public license as
  published by the free software foundation this program is
  distributed in the hope that it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 64 file(s).
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NAlexios Zavras <alexios.zavras@intel.com>
Reviewed-by: NAllison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.894819585@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

5b497af4

04 6月, 2019 1 次提交

bpf: hbm: fix spelling mistake "notifcations" -> "notificiations" · 2ed99339

由 Colin Ian King 提交于 6月 03, 2019

There is a spelling mistake in the help information, fix this.
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Acked-by: NMartin KaFai Lau <kafai@fb.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

2ed99339

01 6月, 2019 2 次提交

bpf: Add more stats to HBM · d58c6f72

由 brakmo 提交于 5月 28, 2019

Adds more stats to HBM, including average cwnd and rtt of all TCP
flows, percents of packets that are ecn ce marked and distribution
of return values.
Signed-off-by: NLawrence Brakmo <brakmo@fb.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>

d58c6f72

bpf: Add cn support to hbm_out_kern.c · ffd81558

由 brakmo 提交于 5月 28, 2019

Update hbm_out_kern.c to support returning cn notifications.
Also updates relevant files to allow disabling cn notifications.
Signed-off-by: NLawrence Brakmo <brakmo@fb.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>

ffd81558

31 5月, 2019 1 次提交

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 206 · 25763b3c

由 Thomas Gleixner 提交于 5月 28, 2019

Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of version 2 of the gnu general public license as
  published by the free software foundation

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 107 file(s).
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NAllison Randal <allison@lohutok.net>
Reviewed-by: NRichard Fontana <rfontana@redhat.com>
Reviewed-by: NSteve Winslow <swinslow@gmail.com>
Reviewed-by: NAlexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190528171438.615055994@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

25763b3c

29 5月, 2019 1 次提交

selftests/bpf: convert test_cgrp2_attach2 example into kselftest · ba0c0cc0

由 Roman Gushchin 提交于 5月 25, 2019

Convert test_cgrp2_attach2 example into a proper test_cgroup_attach
kselftest. It's better because we do run kselftest on a constant
basis, so there are better chances to spot a potential regression.

Also make it slightly less verbose to conform kselftests output style.

Output example:
  $ ./test_cgroup_attach
  #override:PASS
  #multi:PASS
  test_cgroup_attach:PASS
Signed-off-by: NRoman Gushchin <guro@fb.com>
Acked-by: NYonghong Song <yhs@fb.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>

ba0c0cc0

28 5月, 2019 1 次提交

samples/bpf: fix a couple of style issues in bpf_load · 37b54aed

由 Daniel T. Lee 提交于 5月 23, 2019

This commit fixes a few style problems in samples/bpf/bpf_load.c:

 - Magic string use of 'DEBUGFS'
 - Useless zero initialization of a global variable
 - Minor style fix with whitespace
Signed-off-by: NDaniel T. Lee <danieltimlee@gmail.com>
Acked-by: NYonghong Song <yhs@fb.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

37b54aed

25 5月, 2019 2 次提交

samples: bpf: add ibumad sample to .gitignore · d9a6f413

由 Matteo Croce 提交于 5月 24, 2019

This commit adds ibumad to .gitignore which is
currently ommited from the ignore file.
Signed-off-by: NMatteo Croce <mcroce@redhat.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>

d9a6f413

samples: bpf: Do not define bpf_printk macro · c87f60a7

由 Michal Rostecki 提交于 5月 23, 2019

The bpf_printk macro was moved to bpf_helpers.h which is included in all
example programs.
Signed-off-by: NMichal Rostecki <mrostecki@opensuse.org>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>

c87f60a7

21 5月, 2019 2 次提交

samples, bpf: suppress compiler warning · a195ceff

由 Matteo Croce 提交于 5月 20, 2019

GCC 9 fails to calculate the size of local constant strings and produces a
false positive:

samples/bpf/task_fd_query_user.c: In function ‘test_debug_fs_uprobe’:
samples/bpf/task_fd_query_user.c:242:67: warning: ‘%s’ directive output may be truncated writing up to 255 bytes into a region of size 215 [-Wformat-truncation=]
  242 |  snprintf(buf, sizeof(buf), "/sys/kernel/debug/tracing/events/%ss/%s/id",
      |                                                                   ^~
  243 |    event_type, event_alias);
      |                ~~~~~~~~~~~
samples/bpf/task_fd_query_user.c:242:2: note: ‘snprintf’ output between 45 and 300 bytes into a destination of size 256
  242 |  snprintf(buf, sizeof(buf), "/sys/kernel/debug/tracing/events/%ss/%s/id",
      |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  243 |    event_type, event_alias);
      |    ~~~~~~~~~~~~~~~~~~~~~~~~

Workaround this by lowering the buffer size to a reasonable value.
Related GCC Bugzilla: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83431Signed-off-by: NMatteo Croce <mcroce@redhat.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

a195ceff

samples, bpf: fix to change the buffer size for read() · f7c2d64b

由 Chang-Hsien Tsai 提交于 5月 19, 2019

If the trace for read is larger than 4096, the return
value sz will be 4096. This results in off-by-one error
on buf:

    static char buf[4096];
    ssize_t sz;

    sz = read(trace_fd, buf, sizeof(buf));
    if (sz > 0) {
        buf[sz] = 0;
        puts(buf);
    }
Signed-off-by: NChang-Hsien Tsai <luke.tw@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

f7c2d64b

26 4月, 2019 1 次提交

samples: bpf: add hbm sample to .gitignore · ead442a0

由 Daniel T. Lee 提交于 4月 25, 2019

This commit adds hbm to .gitignore which is
currently ommited from the ignore file.
Signed-off-by: NDaniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

ead442a0

05 4月, 2019 1 次提交

samples/bpf: fix build with new clang · 636e78b1

由 Alexei Starovoitov 提交于 4月 04, 2019

clang started to error on invalid asm clobber usage in x86 headers
and many bpf program samples failed to build with the message:

  CLANG-bpf  /data/users/ast/bpf-next/samples/bpf/xdp_redirect_kern.o
In file included from /data/users/ast/bpf-next/samples/bpf/xdp_redirect_kern.c:14:
In file included from ../include/linux/in.h:23:
In file included from ../include/uapi/linux/in.h:24:
In file included from ../include/linux/socket.h:8:
In file included from ../include/linux/uio.h:14:
In file included from ../include/crypto/hash.h:16:
In file included from ../include/linux/crypto.h:26:
In file included from ../include/linux/uaccess.h:5:
In file included from ../include/linux/sched.h:15:
In file included from ../include/linux/sem.h:5:
In file included from ../include/uapi/linux/sem.h:5:
In file included from ../include/linux/ipc.h:9:
In file included from ../include/linux/refcount.h:72:
../arch/x86/include/asm/refcount.h:72:36: error: asm-specifier for input or output variable conflicts with asm clobber list
                                         r->refs.counter, e, "er", i, "cx");
                                                                      ^
../arch/x86/include/asm/refcount.h:86:27: error: asm-specifier for input or output variable conflicts with asm clobber list
                                         r->refs.counter, e, "cx");
                                                             ^
2 errors generated.

Override volatile() to workaround the problem.
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

636e78b1

04 4月, 2019 1 次提交

samples, selftests/bpf: add NULL check for ksym_search · e67b2c71

由 Daniel T. Lee 提交于 4月 04, 2019

Since, ksym_search added with verification logic for symbols existence,
it could return NULL when the kernel symbols are not loaded.

This commit will add NULL check logic after ksym_search.
Signed-off-by: NDaniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

e67b2c71

28 3月, 2019 1 次提交

BPF: Add sample code for new ib_umad tracepoint · 0ac01feb

由 Ira Weiny 提交于 3月 19, 2019

Provide a count of class types for a summary of MAD packets.  The example
shows one way to filter the trace data based on management class.
Signed-off-by: NIra Weiny <ira.weiny@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

0ac01feb

22 3月, 2019 1 次提交

samples: bpf: add xdp_sample_pkts to .gitignore · ab99e7a8

由 Daniel T. Lee 提交于 3月 20, 2019

This commit adds xdp_sample_pkts to .gitignore which is
currently ommited from the ignore file.
Signed-off-by: NDaniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>

ab99e7a8

07 3月, 2019 1 次提交

bpf: hbm: fix spelling mistake "deault" -> "default" · 5b4f21b2

由 Colin Ian King 提交于 3月 05, 2019

There are a couple of typos, fix these.
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Acked-by: NSong Liu <songliubraving@fb.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

5b4f21b2

03 3月, 2019 3 次提交

bpf: HBM test script · 4ffd44cf

由 brakmo 提交于 3月 01, 2019

Script for testing HBM (Host Bandwidth Manager) framework.
It creates a cgroup to use for testing and load a BPF program to limit
egress bandwidht. It then uses iperf3 or netperf to create
loads. The output is the goodput in Mbps (unless -D is used).

It can work on a single host using loopback or among two hosts (with netperf).
When using loopback, it is recommended to also introduce a delay of at least
1ms (-d=1), otherwise the assigned bandwidth is likely to be underutilized.

USAGE: $name [out] [-b=<prog>|--bpf=<prog>] [-c=<cc>|--cc=<cc>] [-D]
             [-d=<delay>|--delay=<delay>] [--debug] [-E]
             [-f=<#flows>|--flows=<#flows>] [-h] [-i=<id>|--id=<id >] [-l]
	     [-N] [-p=<port>|--port=<port>] [-P] [-q=<qdisc>]
             [-R] [-s=<server>|--server=<server] [--stats]
	     [-t=<time>|--time=<time>] [-w] [cubic|dctcp]
  Where:
    out               Egress (default egress)
    -b or --bpf       BPF program filename to load and attach.
                      Default is nrm_out_kern.o for egress,
    -c or -cc         TCP congestion control (cubic or dctcp)
    -d or --delay     Add a delay in ms using netem
    -D                In addition to the goodput in Mbps, it also outputs
                      other detailed information. This information is
                      test dependent (i.e. iperf3 or netperf).
    --debug           Print BPF trace buffer
    -E                Enable ECN (not required for dctcp)
    -f or --flows     Number of concurrent flows (default=1)
    -i or --id        cgroup id (an integer, default is 1)
    -l                Do not limit flows using loopback
    -N                Use netperf instead of iperf3
    -h                Help
    -p or --port      iperf3 port (default is 5201)
    -P                Use an iperf3 instance for each flow
    -q                Use the specified qdisc.
    -r or --rate      Rate in Mbps (default 1s 1Gbps)
    -R                Use TCP_RR for netperf. 1st flow has req
                      size of 10KB, rest of 1MB. Reply in all
                      cases is 1 byte.
                      More detailed output for each flow can be found
                      in the files netperf.<cg>.<flow>, where <cg> is the
                      cgroup id as specified with the -i flag, and <flow>
                      is the flow id starting at 1 and increasing by 1 for
                      flow (as specified by -f).
    -s or --server    hostname of netperf server. Used to create netperf
                      test traffic between to hosts (default is within host)
                      netserver must be running on the host.
    --stats           Get HBM stats (marked, dropped, etc.)
    -t or --time      duration of iperf3 in seconds (default=5)
    -w                Work conserving flag. cgroup can increase its
                      bandwidth beyond the rate limit specified
                      while there is available bandwidth. Current
                      implementation assumes there is only one NIC
                      (eth0), but can be extended to support multiple
                      NICs. This is just a proof of concept.
    cubic or dctcp    specify TCP CC to use

Examples:
 ./do_hbm_test.sh -l -d=1 -D --stats
     Runs a 5 second test, using a single iperf3 flow and with the default
     rate limit of 1Gbps and a delay of 1ms (using netem) using the default
     TCP congestion control on the loopback device (hence we use "-l" to
     enforce bandwidth limit on loopback device). Since no direction is
     specified, it defaults to egress. Since no TCP CC algorithm is
     specified it uses the system default (Cubic for this test).
     With no -D flag, only the value of the AGGREGATE OUTPUT would show.
     id refers to the cgroup id and is useful when running multi cgroup
     tests (supported by a future patch).
     This patchset does not support calling TCP's congesion window
     reduction, even when packets are dropped by the BPF program, resulting
     in a large number of packets dropped. It is recommended that the  current
     HBM implemenation only be used with ECN enabled flows. A future patch
     will add support for reducing TCP's cwnd and will increase the
     performance of non-ECN enabled flows.
   Output:
     Details for HBM in cgroup 1
     id:1
     rate_mbps:493
     duration:4.8 secs
     packets:11355
     bytes_MB:590
     pkts_dropped:4497
     bytes_dropped_MB:292
     pkts_marked_percent: 39.60
     bytes_marked_percent: 49.49
     pkts_dropped_percent: 39.60
     bytes_dropped_percent: 49.49
     PING AVG DELAY:2.075
     AGGREGATE_GOODPUT:505

./do_nrm_test.sh -l -d=1 -D --stats dctcp
     Same as above but using dctcp. Note that fewer bytes are dropped
     (0.01% vs. 49%).
   Output:
     Details for HBM in cgroup 1
     id:1
     rate_mbps:945
     duration:4.9 secs
     packets:16859
     bytes_MB:578
     pkts_dropped:1
     bytes_dropped_MB:0
     pkts_marked_percent: 28.74
     bytes_marked_percent: 45.15
     pkts_dropped_percent:  0.01
     bytes_dropped_percent:  0.01
     PING AVG DELAY:2.083
     AGGREGATE_GOODPUT:965

./do_nrm_test.sh -d=1 -D --stats
     As first example, but without limiting loopback device (i.e. no
     "-l" flag). Since there is no bandwidth limiting, no details for
     HBM are printed out.
   Output:
     Details for HBM in cgroup 1
     PING AVG DELAY:2.019
     AGGREGATE_GOODPUT:42655

./do_hbm.sh -l -d=1 -D --stats -f=2
     Uses iper3 and does 2 flows
./do_hbm.sh -l -d=1 -D --stats -f=4 -P
     Uses iperf3 and does 4 flows, each flow as a separate process.
./do_hbm.sh -l -d=1 -D --stats -f=4 -N
     Uses netperf, 4 flows
./do_hbm.sh -f=1 -r=2000 -t=5 -N -D --stats dctcp -s=<server-name>
     Uses netperf between two hosts. The remote host name is specified
     with -s= and you need to start the program netserver manually on
     the remote host. It will use 1 flow, a rate limit of 2Gbps and dctcp.
./do_hbm.sh -f=1 -r=2000 -t=5 -N -D --stats -w dctcp \
     -s=<server-name>
     As previous, but allows use of extra bandwidth. For this test the
     rate is 8Gbps vs. 1Gbps of the previous test.
Signed-off-by: NLawrence Brakmo <brakmo@fb.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>

4ffd44cf

bpf: User program for testing HBM · a1270fe9

由 brakmo 提交于 3月 01, 2019

The program nrm creates a cgroup and attaches a BPF program to the
cgroup for testing HBM (Host Bandwidth Manager) for egress traffic.
One still needs to create network traffic. This can be done through
netesto, netperf or iperf3.
A follow-up patch contains a script to create traffic.

USAGE: hbm [-d] [-l] [-n <id>] [-r <rate>] [-s] [-t <secs>]
           [-w] [-h] [prog]
  Where:
   -d        Print BPF trace debug buffer
   -l        Also limit flows doing loopback
   -n <#>    To create cgroup "/hbm#" and attach prog. Default is /nrm1
             This is convenient when testing HBM in more than 1 cgroup
   -r <rate> Rate limit in Mbps
   -s        Get HBM stats (marked, dropped, etc.)
   -t <time> Exit after specified seconds (deault is 0)
   -w        Work conserving flag. cgroup can increase its bandwidth
             beyond the rate limit specified while there is available
             bandwidth. Current implementation assumes there is only
             NIC (eth0), but can be extended to support multiple NICs.
             Currrently only supported for egress. Note, this is just
	     a proof of concept.
   -h        Print this info
   prog      BPF program file name. Name defaults to hbm_out_kern.o

More information about HBM can be found in the paper "BPF Host Resource
Management" presented at the 2018 Linux Plumbers Conference, Networking Track
(http://vger.kernel.org/lpc_net2018_talks/LPC%20BPF%20Network%20Resource%20Paper.pdf)
Signed-off-by: NLawrence Brakmo <brakmo@fb.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>

a1270fe9

bpf: Sample HBM BPF program to limit egress bw · 187d0738

由 brakmo 提交于 3月 01, 2019

A cgroup skb BPF program to limit cgroup output bandwidth.
It uses a modified virtual token bucket queue to limit average
egress bandwidth. The implementation uses credits instead of tokens.
Negative credits imply that queueing would have happened (this is
a virtual queue, so no queueing is done by it. However, queueing may
occur at the actual qdisc (which is not used for rate limiting).

This implementation uses 3 thresholds, one to start marking packets and
the other two to drop packets:
                                 CREDIT
       - <--------------------------|------------------------> +
             |    |          |      0
             |  Large pkt    |
             |  drop thresh  |
  Small pkt drop             Mark threshold
      thresh

The effect of marking depends on the type of packet:
a) If the packet is ECN enabled, then the packet is ECN ce marked.
   The current mark threshold is tuned for DCTCP.
c) Else, it is dropped if it is a large packet.

If the credit is below the drop threshold, the packet is dropped.
Note that dropping a packet through the BPF program does not trigger CWR
(Congestion Window Reduction) in TCP packets. A future patch will add
support for triggering CWR.

This BPF program actually uses 2 drop thresholds, one threshold
for larger packets (>= 120 bytes) and another for smaller packets. This
protects smaller packets such as SYNs, ACKs, etc.

The default bandwidth limit is set at 1Gbps but this can be changed by
a user program through a shared BPF map. In addition, by default this BPF
program does not limit connections using loopback. This behavior can be
overwritten by the user program. There is also an option to calculate
some statistics, such as percent of packets marked or dropped, which
the user program can access.

A latter patch provides such a program (hbm.c)
Signed-off-by: NLawrence Brakmo <brakmo@fb.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>

187d0738

02 3月, 2019 1 次提交

samples/bpf: silence compiler warning for xdpsock_user.c · b74e21ab

由 Yonghong Song 提交于 2月 28, 2019

Compiling xdpsock_user.c with 4.8.5, I hit the following
compilation warning:
    HOSTCC  samples/bpf/xdpsock_user.o
  /data/users/yhs/work/net-next/samples/bpf/xdpsock_user.c: In function ‘main’:
  /data/users/yhs/work/net-next/samples/bpf/xdpsock_user.c:449:6: warning: ‘idx_cq’ may be used unini
  tialized in this function [-Wmaybe-uninitialized]
    u32 idx_cq, idx_fq;
        ^
  /data/users/yhs/work/net-next/samples/bpf/xdpsock_user.c:606:7: warning: ‘idx_rx’ may be used unini
  tialized in this function [-Wmaybe-uninitialized]
     u32 idx_rx, idx_tx = 0;
         ^
  /data/users/yhs/work/net-next/samples/bpf/xdpsock_user.c:506:6: warning: ‘idx_rx’ may be used unini
  tialized in this function [-Wmaybe-uninitialized]
    u32 idx_rx, idx_fq = 0;

As an example, the code pattern looks like:
    u32 idx_cq;
    ...
    ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd, &idx_fq);
    if (ret) {
      ...
    }
    ... idx_fq ...
The compiler warns since it does not know whether &idx_fq is assigned
or not inside the library function xsk_ring_prod__reserve().

Let us assign an initial value 0 to such auto variables to silence
compiler warning.

Fixes: 248c7f9c ("samples/bpf: convert xdpsock to use libbpf for AF_XDP access")
Signed-off-by: NYonghong Song <yhs@fb.com>
Acked-by: NJonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: NSong Liu <songliubraving@fb.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

b74e21ab

01 3月, 2019 3 次提交

samples: bpf: use libbpf where easy · 1a9b268c

由 Jakub Kicinski 提交于 2月 27, 2019

Some samples don't really need the magic of bpf_load,
switch them to libbpf.

v2: - specify program types.
Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: NQuentin Monnet <quentin.monnet@netronome.com>
Acked-by: NAndrii Nakryiko <andriin@fb.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

1a9b268c

samples: bpf: remove load_sock_ops in favour of bpftool · ea9b6362

由 Jakub Kicinski 提交于 2月 27, 2019

bpftool can do all the things load_sock_ops used to do, and more.
Point users to bpftool instead of maintaining this sample utility.
Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: NQuentin Monnet <quentin.monnet@netronome.com>
Acked-by: NAndrii Nakryiko <andriin@fb.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

ea9b6362

samples: bpf: force IPv4 in ping · 5c3cf87d

由 Jakub Kicinski 提交于 2月 27, 2019

ping localhost may default of IPv6 on modern systems, but
samples are trying to only parse IPv4.  Force IPv4.

samples/bpf/tracex1_user.c doesn't interpret the packet so
we don't care which IP version will be used there.
Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: NQuentin Monnet <quentin.monnet@netronome.com>
Acked-by: NAndrii Nakryiko <andriin@fb.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

5c3cf87d

28 2月, 2019 1 次提交

samples: bpf: fix: broken sample regarding removed function · d2e614cb

由 Daniel T. Lee 提交于 2月 27, 2019

Currently, running sample "task_fd_query" and "tracex3" occurs the
following error. On kernel v5.0-rc* this sample will be unavailable
due to the removal of function 'blk_start_request' at commit "a1ce35fa".
(function removed, as "Single Queue IO scheduler" no longer exists)

$ sudo ./task_fd_query
failed to create kprobe 'blk_start_request' error 'No such file or
directory'

This commit will change the function 'blk_start_request' to
'blk_mq_start_request' to fix the broken sample.
Signed-off-by: NDaniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

d2e614cb

26 2月, 2019 1 次提交

samples/bpf: convert xdpsock to use libbpf for AF_XDP access · 248c7f9c

由 Magnus Karlsson 提交于 2月 21, 2019

This commit converts the xdpsock sample application to use the AF_XDP
functions present in libbpf. This cuts down the size of it by nearly
300 lines of code.

The default ring sizes plus the batch size has been increased and the
size of the umem area has decreased. This so that the sample application
will provide higher throughput. Note also that the shared umem code
has been removed from the sample as this is not supported by libbpf
at this point in time.
Tested-by: NBjörn Töpel <bjorn.topel@intel.com>
Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

248c7f9c

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功