- 02 7月, 2017 7 次提交
-
-
由 Lawrence Brakmo 提交于
Sample BPF program that sets congestion control to dctcp when both hosts are within the same datacenter. In this example that is assumed to be when they have the first 5.5 bytes of their IPv6 address are the same. Signed-off-by: NLawrence Brakmo <brakmo@fb.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Lawrence Brakmo 提交于
This patch contains a BPF program to set initial receive window to 40 packets and send and receive buffers to 1.5MB. This would usually be done after doing appropriate checks that indicate the hosts are far enough away (i.e. large RTT). Signed-off-by: NLawrence Brakmo <brakmo@fb.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Lawrence Brakmo 提交于
Added support for calling a subset of socket setsockopts from BPF_PROG_TYPE_SOCK_OPS programs. The code was duplicated rather than making the changes to call the socket setsockopt function because the changes required would have been larger. The ops supported are: SO_RCVBUF SO_SNDBUF SO_MAX_PACING_RATE SO_PRIORITY SO_RCVLOWAT SO_MARK Signed-off-by: NLawrence Brakmo <brakmo@fb.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Lawrence Brakmo 提交于
The sample bpf program, tcp_rwnd_kern.c, sets the initial advertized window to 40 packets in an environment where distinct IPv6 prefixes indicate that both hosts are not in the same data center. Signed-off-by: NLawrence Brakmo <brakmo@fb.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Lawrence Brakmo 提交于
The sample BPF program, tcp_synrto_kern.c, sets the SYN and SYN-ACK RTOs to 10ms when both hosts are within the same datacenter (i.e. small RTTs) in an environment where common IPv6 prefixes indicate both hosts are in the same data center. Signed-off-by: NLawrence Brakmo <brakmo@fb.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Lawrence Brakmo 提交于
The program load_sock_ops can be used to load sock_ops bpf programs and to attach it to an existing (v2) cgroup. It can also be used to detach sock_ops programs. Examples: load_sock_ops [-l] <cg-path> <prog filename> Load and attaches a sock_ops program at the specified cgroup. If "-l" is used, the program will continue to run to output the BPF log buffer. If the specified filename does not end in ".o", it appends "_kern.o" to the name. load_sock_ops -r <cg-path> Detaches the currently attached sock_ops program from the specified cgroup. Signed-off-by: NLawrence Brakmo <brakmo@fb.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Lawrence Brakmo 提交于
Created a new BPF program type, BPF_PROG_TYPE_SOCK_OPS, and a corresponding struct that allows BPF programs of this type to access some of the socket's fields (such as IP addresses, ports, etc.). It uses the existing bpf cgroups infrastructure so the programs can be attached per cgroup with full inheritance support. The program will be called at appropriate times to set relevant connections parameters such as buffer sizes, SYN and SYN-ACK RTOs, etc., based on connection information such as IP addresses, port numbers, etc. Alghough there are already 3 mechanisms to set parameters (sysctls, route metrics and setsockopts), this new mechanism provides some distinct advantages. Unlike sysctls, it can set parameters per connection. In contrast to route metrics, it can also use port numbers and information provided by a user level program. In addition, it could set parameters probabilistically for evaluation purposes (i.e. do something different on 10% of the flows and compare results with the other 90% of the flows). Also, in cases where IPv6 addresses contain geographic information, the rules to make changes based on the distance (or RTT) between the hosts are much easier than route metric rules and can be global. Finally, unlike setsockopt, it oes not require application changes and it can be updated easily at any time. Although the bpf cgroup framework already contains a sock related program type (BPF_PROG_TYPE_CGROUP_SOCK), I created the new type (BPF_PROG_TYPE_SOCK_OPS) beccause the existing type expects to be called only once during the connections's lifetime. In contrast, the new program type will be called multiple times from different places in the network stack code. For example, before sending SYN and SYN-ACKs to set an appropriate timeout, when the connection is established to set congestion control, etc. As a result it has "op" field to specify the type of operation requested. The purpose of this new program type is to simplify setting connection parameters, such as buffer sizes, TCP's SYN RTO, etc. For example, it is easy to use facebook's internal IPv6 addresses to determine if both hosts of a connection are in the same datacenter. Therefore, it is easy to write a BPF program to choose a small SYN RTO value when both hosts are in the same datacenter. This patch only contains the framework to support the new BPF program type, following patches add the functionality to set various connection parameters. This patch defines a new BPF program type: BPF_PROG_TYPE_SOCKET_OPS and a new bpf syscall command to load a new program of this type: BPF_PROG_LOAD_SOCKET_OPS. Two new corresponding structs (one for the kernel one for the user/BPF program): /* kernel version */ struct bpf_sock_ops_kern { struct sock *sk; __u32 op; union { __u32 reply; __u32 replylong[4]; }; }; /* user version * Some fields are in network byte order reflecting the sock struct * Use the bpf_ntohl helper macro in samples/bpf/bpf_endian.h to * convert them to host byte order. */ struct bpf_sock_ops { __u32 op; union { __u32 reply; __u32 replylong[4]; }; __u32 family; __u32 remote_ip4; /* In network byte order */ __u32 local_ip4; /* In network byte order */ __u32 remote_ip6[4]; /* In network byte order */ __u32 local_ip6[4]; /* In network byte order */ __u32 remote_port; /* In network byte order */ __u32 local_port; /* In host byte horder */ }; Currently there are two types of ops. The first type expects the BPF program to return a value which is then used by the caller (or a negative value to indicate the operation is not supported). The second type expects state changes to be done by the BPF program, for example through a setsockopt BPF helper function, and they ignore the return value. The reply fields of the bpf_sockt_ops struct are there in case a bpf program needs to return a value larger than an integer. Signed-off-by: NLawrence Brakmo <brakmo@fb.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 6月, 2017 1 次提交
-
-
由 Martin KaFai Lau 提交于
Checks are added to the existing sockex3 and test_map_in_map test. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 6月, 2017 1 次提交
-
-
由 Yonghong Song 提交于
tracex5_kern.c build failed with the following error message: ../samples/bpf/tracex5_kern.c:12:10: fatal error: 'syscall_nrs.h' file not found #include "syscall_nrs.h" The generated file syscall_nrs.h is put in build/samples/bpf directory, but this directory is not in include path, hence build failed. The fix is to add $(obj) into the clang compilation path. Signed-off-by: NYonghong Song <yhs@fb.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 6月, 2017 2 次提交
-
-
由 David Daney 提交于
There are two problems: 1) In MIPS the __NR_* macros expand to an expression, this causes the sections of the object file to be named like: . . . [ 5] kprobe/(5000 + 1) PROGBITS 0000000000000000 000160 ... [ 6] kprobe/(5000 + 0) PROGBITS 0000000000000000 000258 ... [ 7] kprobe/(5000 + 9) PROGBITS 0000000000000000 000348 ... . . . The fix here is to use the "asm_offsets" trick to evaluate the macros in the C compiler and generate a header file with a usable form of the macros. 2) MIPS syscall numbers start at 5000, so we need a bigger map to hold the sub-programs. Signed-off-by: NDavid Daney <david.daney@cavium.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Daney 提交于
Signed-off-by: NDavid Daney <david.daney@cavium.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 6月, 2017 1 次提交
-
-
由 Teng Qin 提交于
$ trace_event tests attaching BPF program to HW_CPU_CYCLES, SW_CPU_CLOCK, HW_CACHE_L1D and other events. It runs 'dd' in the background while bpf program collects user and kernel stack trace on counter overflow. User space expects to see sys_read and sys_write in the kernel stack. $ tracex6 tests reading of various perf counters from BPF program. Both tests were refactored to increase coverage and be more accurate. Signed-off-by: NTeng Qin <qinteng@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 6月, 2017 1 次提交
-
-
由 Jesper Dangaard Brouer 提交于
An eBPF ELF file generated with LLVM can contain several program section, which can be used for bpf tail calls. The bpf prog file descriptors are accessible via array prog_fd[]. At-least XDP samples assume ordering, and uses prog_fd[0] is the main XDP program to attach. The actual order of array prog_fd[] depend on whether or not a bpf program section is referencing any maps or not. Not using a map result in being loaded/processed after all other prog section. Thus, this can lead to some very strange and hard to debug situation, as the user can only see a FD and cannot correlated that with the ELF section name. The fix is rather simple, and even removes duplicate memcmp code. Simply load program sections as the last step, instead of load_and_attach while processing the relocation section. When working with tail calls, it become even more essential that the order of prog_fd[] is consistant, like the current dependency of the map_fd[] order. Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 5月, 2017 2 次提交
-
-
由 Andy Gospodarek 提交于
Shahid Habib noticed that when xdp1 was killed from a different console the xdp program was not cleaned-up properly in the kernel and it continued to forward traffic. Most of the applications in samples/bpf cleanup properly, but only when getting SIGINT. Since kill defaults to using SIGTERM, add support to cleanup when the application receives either SIGINT or SIGTERM. Signed-off-by: NAndy Gospodarek <andy@greyhouse.net> Reported-by: NShahid Habib <shahid.habib@broadcom.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
After commit b5cdae32 ("net: Generic XDP") we automatically fall back to a generic XDP variant if the driver does not support native XDP. Allow for an option where the user can specify that always the native XDP variant should be selected and in case it's not supported by a driver, just bail out. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 5月, 2017 4 次提交
-
-
由 Jesper Dangaard Brouer 提交于
Giving *_user.c side tools access to map_data[] provides easier access to information on the maps being loaded. Still provide the guarantee that the order maps are being defined in inside the _kern.c file corresponds with the order in the array. Now user tools are not blind, but can inspect and verify the maps that got loaded from the ELF binary. Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jesper Dangaard Brouer 提交于
Do this change before others start to use this callback. Change map_perf_test_user.c which seems to be the only user. This patch extends capabilities of commit 9fd63d05 ("bpf: Allow bpf sample programs (*_user.c) to change bpf_map_def"). Give fixup callback access to struct bpf_map_data, instead of only stuct bpf_map_def. This add flexibility to allow userspace to reassign the map file descriptor. This is very useful when wanting to share maps between several bpf programs. Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jesper Dangaard Brouer 提交于
This patch does proper parsing of the ELF "maps" section, in-order to be both backwards and forwards compatible with changes to the map definition struct bpf_map_def, which gets compiled into the ELF file. The assumption is that new features with value zero, means that they are not in-use. For backward compatibility where loading an ELF file with a smaller struct bpf_map_def, only copy objects ELF size, leaving rest of loaders struct zero. For forward compatibility where ELF file have a larger struct bpf_map_def, only copy loaders own struct size and verify that rest of the larger struct is zero, assuming this means the newer feature was not activated, thus it should be safe for this older loader to load this newer ELF file. Fixes: fb30d4b7 ("bpf: Add tests for map-in-map") Fixes: 409526bea3c3 ("samples/bpf: bpf_load.c detect and abort if ELF maps section size is wrong") Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jesper Dangaard Brouer 提交于
Needed to adjust max locked memory RLIMIT_MEMLOCK for testing these bpf samples as these are using more and larger maps than can fit in distro default 64Kbytes limit. Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 5月, 2017 1 次提交
-
-
由 Daniel Borkmann 提交于
Fix the following warnings triggered by 51570a5a ("A Sample of using socket cookie and uid for traffic monitoring"): In file included from /home/foo/net-next/samples/bpf/cookie_uid_helper_example.c:54:0: /home/foo/net-next/samples/bpf/cookie_uid_helper_example.c: In function 'prog_load': /home/foo/net-next/samples/bpf/cookie_uid_helper_example.c:119:27: warning: overflow in implicit constant conversion [-Woverflow] -32 + offsetof(struct stats, uid)), ^ /home/foo/net-next/samples/bpf/libbpf.h:135:12: note: in definition of macro 'BPF_STX_MEM' .off = OFF, \ ^ /home/foo/net-next/samples/bpf/cookie_uid_helper_example.c:121:27: warning: overflow in implicit constant conversion [-Woverflow] -32 + offsetof(struct stats, packets), 1), ^ /home/foo/net-next/samples/bpf/libbpf.h:155:12: note: in definition of macro 'BPF_ST_MEM' .off = OFF, \ ^ /home/foo/net-next/samples/bpf/cookie_uid_helper_example.c:129:27: warning: overflow in implicit constant conversion [-Woverflow] -32 + offsetof(struct stats, bytes)), ^ /home/foo/net-next/samples/bpf/libbpf.h:135:12: note: in definition of macro 'BPF_STX_MEM' .off = OFF, \ ^ HOSTLD /home/foo/net-next/samples/bpf/per_socket_stats_example Fixes: 51570a5a ("A Sample of using socket cookie and uid for traffic monitoring") Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 5月, 2017 3 次提交
-
-
由 Jesper Dangaard Brouer 提交于
The xdp_tx_iptunnel program can be terminated in two ways, after N-seconds or via Ctrl-C SIGINT. The SIGINT code path does not handle detatching the correct XDP program, in-case the program was attached with XDP_FLAGS_SKB_MODE. Fix this by storing the XDP flags as a global variable, which is available for the SIGINT handler function. Fixes: 3993f2cb ("samples/bpf: Add support for SKB_MODE to xdp1 and xdp_tx_iptunnel") Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Reviewed-by: NAndy Gospodarek <andy@greyhouse.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jesper Dangaard Brouer 提交于
The kernel side of XDP_FLAGS_SKB_MODE is unsigned, and the rtnetlink IFLA_XDP_FLAGS is defined as NLA_U32. Thus, userspace programs under samples/bpf/ should use the correct type. Fixes: 3993f2cb ("samples/bpf: Add support for SKB_MODE to xdp1 and xdp_tx_iptunnel") Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Reviewed-by: NAndy Gospodarek <andy@greyhouse.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jesper Dangaard Brouer 提交于
The struct bpf_map_def was extended in commit fb30d4b7 ("bpf: Add tests for map-in-map") with member unsigned int inner_map_idx. This changed the size of the maps section in the generated ELF _kern.o files. Unfortunately the loader in bpf_load.c does not detect or handle this. Thus, older _kern.o files became incompatible, and caused hard-to-debug errors where the syscall validation rejected BPF_MAP_CREATE request. This patch only detect the situation and aborts load_bpf_file(). It also add code comments warning people that read this loader for inspiration for these pitfalls. Fixes: fb30d4b7 ("bpf: Add tests for map-in-map") Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 4月, 2017 1 次提交
-
-
由 David Ahern 提交于
Add option to xdp1 and xdp_tx_iptunnel to insert xdp program in SKB_MODE: - update set_link_xdp_fd to take a flags argument that is added to the RTM_SETLINK message - Add -S option to xdp1 and xdp_tx_iptunnel user code. When passed in XDP_FLAGS_SKB_MODE is set in the flags arg passed to set_link_xdp_fd Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 4月, 2017 3 次提交
-
-
由 Alexander Alemayhu 提交于
Fixes the following warning samples/bpf/test_lru_dist.c:28:0: warning: "offsetof" redefined #define offsetof(TYPE, MEMBER) ((size_t)&((TYPE *)0)->MEMBER) In file included from ./tools/lib/bpf/bpf.h:25:0, from samples/bpf/libbpf.h:5, from samples/bpf/test_lru_dist.c:24: /usr/lib/gcc/x86_64-redhat-linux/6.3.1/include/stddef.h:417:0: note: this is the location of the previous definition #define offsetof(TYPE, MEMBER) __builtin_offsetof (TYPE, MEMBER) Signed-off-by: NAlexander Alemayhu <alexander@alemayhu.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Alemayhu 提交于
Fixes the following warning samples/bpf/cookie_uid_helper_example.c: At top level: samples/bpf/cookie_uid_helper_example.c:276:6: warning: no previous prototype for ‘finish’ [-Wmissing-prototypes] void finish(int ret) ^~~~~~ HOSTLD samples/bpf/per_socket_stats_example Signed-off-by: NAlexander Alemayhu <alexander@alemayhu.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Alemayhu 提交于
I was initially going to remove '-Wno-address-of-packed-member' because I thought it was not supposed to be there but Daniel suggested using '-Wno-unknown-warning-option'. This silences several warnings similiar to the one below warning: unknown warning option '-Wno-address-of-packed-member' [-Wunknown-warning-option] 1 warning generated. clang -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/6.3.1/include -I./arch/x86/include -I./arch/x86/include/generated/uapi -I./arch/x86/include/generated -I./include -I./arch/x86/include/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h \ -D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \ -Wno-compare-distinct-pointer-types \ -Wno-gnu-variable-sized-type-not-at-end \ -Wno-address-of-packed-member -Wno-tautological-compare \ -O2 -emit-llvm -c samples/bpf/xdp_tx_iptunnel_kern.c -o -| llc -march=bpf -filetype=obj -o samples/bpf/xdp_tx_iptunnel_kern.o $ clang --version clang version 3.9.1 (tags/RELEASE_391/final) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /usr/bin Signed-off-by: NAlexander Alemayhu <alexander@alemayhu.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 4月, 2017 1 次提交
-
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
-
- 18 4月, 2017 3 次提交
-
-
由 Martin KaFai Lau 提交于
This patch adds a map-in-map LRU example. If we know only a subset of cores will use the LRU, we can allocate a common LRU list per targeting core and store it into an array-of-hashs. It allows using the common LRU map with map-update performance comparable to the BPF_F_NO_COMMON_LRU map but without wasting memory on the unused cores that we know they will never access the LRU map. BPF_F_NO_COMMON_LRU: > map_perf_test 32 8 10000000 10000000 | awk '{sum += $3}END{print sum}' 9234314 (9.23M/s) map-in-map LRU: > map_perf_test 512 8 1260000 80000000 | awk '{sum += $3}END{print sum}' 9962743 (9.96M/s) Notes that the max_entries for the map-in-map LRU test is 1260000 which is the max_entries for each inner LRU map. 8 processes have been started, so 8 * 1260000 = 10080000 (~10M) which is close to what is used in the BPF_F_NO_COMMON_LRU test. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Martin KaFai Lau 提交于
The current bpf_map_def is statically defined during compile time. This patch allows the *_user.c program to change it during runtime. It is done by adding load_bpf_file_fixup_map() which takes a callback. The callback will be called before creating each map so that it has a chance to modify the bpf_map_def. The current usecase is to change max_entries in map_perf_test. It is interesting to test with a much bigger map size in some cases (e.g. the following patch on bpf_lru_map.c). However, it is hard to find one size to fit all testing environment. Hence, it is handy to take the max_entries as a cmdline arg and then configure the bpf_map_def during runtime. This patch adds two cmdline args. One is to configure the map's max_entries. Another is to configure the max_cnt which controls how many times a syscall is called. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Martin KaFai Lau 提交于
One more LRU test will be added later in this patch series. In this patch, we first move all existing LRU map tests into a single syscall (connect) first so that the future new LRU test can be added without hunting another syscall. One of the map name is also changed from percpu_lru_hash_map to nocommon_lru_hash_map to avoid the confusion with percpu_hash_map. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 4月, 2017 1 次提交
-
-
由 Chenbo Feng 提交于
Added a per socket traffic monitoring option to illustrate the usage of new getsockopt SO_COOKIE. The program is based on the socket traffic monitoring program using xt_eBPF and in the new option the data entry can be directly accessed using socket cookie. The cookie retrieved allow us to lookup an element in the eBPF for a specific socket. Signed-off-by: NChenbo Feng <fengc@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 3月, 2017 1 次提交
-
-
由 Chenbo Feng 提交于
Add a sample program to demostrate the possible usage of get_socket_cookie and get_socket_uid helper function. The program will store bytes and packets counting of in/out traffic monitored by iptables and store the stats in a bpf map in per socket base. The owner uid of the socket will be stored as part of the data entry. A shell script for running the program is also included. Acked-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NChenbo Feng <fengc@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 3月, 2017 1 次提交
-
-
由 Martin KaFai Lau 提交于
Test cases for array of maps and hash of maps. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 3月, 2017 1 次提交
-
-
由 Alexei Starovoitov 提交于
$ map_perf_test 128 speed of HASH bpf_map_lookup_elem() in lookups per second w/o JIT w/JIT before 46M 58M after 42M 74M perf report before: 54.23% map_perf_test [kernel.kallsyms] [k] __htab_map_lookup_elem 14.24% map_perf_test [kernel.kallsyms] [k] lookup_elem_raw 8.84% map_perf_test [kernel.kallsyms] [k] htab_map_lookup_elem 5.93% map_perf_test [kernel.kallsyms] [k] bpf_map_lookup_elem 2.30% map_perf_test [kernel.kallsyms] [k] bpf_prog_da4fc6a3f41761a2 1.49% map_perf_test [kernel.kallsyms] [k] kprobe_ftrace_handler after: 60.03% map_perf_test [kernel.kallsyms] [k] __htab_map_lookup_elem 18.07% map_perf_test [kernel.kallsyms] [k] lookup_elem_raw 2.91% map_perf_test [kernel.kallsyms] [k] bpf_prog_da4fc6a3f41761a2 1.94% map_perf_test [kernel.kallsyms] [k] _einittext 1.90% map_perf_test [kernel.kallsyms] [k] __audit_syscall_exit 1.72% map_perf_test [kernel.kallsyms] [k] kprobe_ftrace_handler Notice that bpf_map_lookup_elem() and htab_map_lookup_elem() are trivial functions, yet they take sizeable amount of cpu time. htab_map_gen_lookup() removes bpf_map_lookup_elem() and converts htab_map_lookup_elem() into three BPF insns which causing cpu time for bpf_prog_da4fc6a3f41761a2() slightly increase. $ map_perf_test 256 speed of ARRAY bpf_map_lookup_elem() in lookups per second w/o JIT w/JIT before 97M 174M after 64M 280M before: 37.33% map_perf_test [kernel.kallsyms] [k] array_map_lookup_elem 13.95% map_perf_test [kernel.kallsyms] [k] bpf_map_lookup_elem 6.54% map_perf_test [kernel.kallsyms] [k] bpf_prog_da4fc6a3f41761a2 4.57% map_perf_test [kernel.kallsyms] [k] kprobe_ftrace_handler after: 32.86% map_perf_test [kernel.kallsyms] [k] bpf_prog_da4fc6a3f41761a2 6.54% map_perf_test [kernel.kallsyms] [k] kprobe_ftrace_handler array_map_gen_lookup() removes calls to array_map_lookup_elem() and bpf_map_lookup_elem() and replaces them with 7 bpf insns. The performance without JIT is slower, since executing extra insns in the interpreter is slower than running native C code, but with JIT the performance gains are obvious, since native C->x86 code is replaced with fewer bpf->x86 instructions. Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 2月, 2017 3 次提交
-
-
由 Mickaël Salaün 提交于
Before loading a new ELF, clean previous kernel version, license and processed sections. Signed-off-by: NMickaël Salaün <mic@digikod.net> Acked-by: NJoe Stringer <joe@ovn.org> Acked-by: NWang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20170208202744.16274-3-mic@digikod.netSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Mickaël Salaün 提交于
Add a missing check for the map fixup loop. Signed-off-by: NMickaël Salaün <mic@digikod.net> Acked-by: NJoe Stringer <joe@ovn.org> Acked-by: NWang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20170208202744.16274-2-mic@digikod.netSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Mickaël Salaün 提交于
Include unistd.h to define __NR_getuid and __NR_getsid. Signed-off-by: NMickaël Salaün <mic@digikod.net> Acked-by: NJoe Stringer <joe@ovn.org> Acked-by: NWang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20170208202744.16274-4-mic@digikod.netSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 13 2月, 2017 1 次提交
-
-
由 Alexei Starovoitov 提交于
If BPF_F_ALLOW_OVERRIDE flag is used in BPF_PROG_ATTACH command to the given cgroup the descendent cgroup will be able to override effective bpf program that was inherited from this cgroup. By default it's not passed, therefore override is disallowed. Examples: 1. prog X attached to /A with default prog Y fails to attach to /A/B and /A/B/C Everything under /A runs prog X 2. prog X attached to /A with allow_override. prog Y fails to attach to /A/B with default (non-override) prog M attached to /A/B with allow_override. Everything under /A/B runs prog M only. 3. prog X attached to /A with allow_override. prog Y fails to attach to /A with default. The user has to detach first to switch the mode. In the future this behavior may be extended with a chain of non-overridable programs. Also fix the bug where detach from cgroup where nothing is attached was not throwing error. Return ENOENT in such case. Add several testcases and adjust libbpf. Fixes: 30070984 ("cgroup: add support for eBPF programs") Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NTejun Heo <tj@kernel.org> Acked-by: NDaniel Mack <daniel@zonque.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 1月, 2017 1 次提交
-
-
由 David Herrmann 提交于
Extend the map_perf_test_{user,kern}.c infrastructure to stress test lpm-trie lookups. We hook into the kprobe on sys_gettid() and measure the latency depending on trie size and lookup count. On my Intel Haswell i7-6400U, a single gettid() syscall with an empty bpf program takes roughly 6.5us on my system. Lookups in empty tries take ~1.8us on first try, ~0.9us on retries. Lookups in tries with 8192 entries take ~7.1us (on the first _and_ any subsequent try). Signed-off-by: NDavid Herrmann <dh.herrmann@gmail.com> Reviewed-by: NDaniel Mack <daniel@zonque.org> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-