Commits · 1772be37fc9d2775ae48c8b3bc34f1899effb7b3 · openeuler / Kernel

15 May, 2018 · 3 commits

Merge branch 'bpf-stackmap-nmi' · 1772be37

Committed by Daniel Borkmann on May 14, 2018

Song Liu says:
====================
Changes v2 -> v3:
  Improve syntax based on suggestion by Tobin C. Harding.

Changes v1 -> v2:
  1. Rename some variables to (hopefully) reduce confusion;
  2. Check irq_work status with IRQ_WORK_BUSY (instead of work->sem);
  3. In Kconfig, let BPF_SYSCALL select IRQ_WORK;
  4. Add static to DEFINE_PER_CPU();
  5. Remove pr_info() in stack_map_init().
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

1772be37

bpf: add selftest for stackmap with build_id in NMI context · 13790d1c

Committed by Song Liu on May 07, 2018

This new test captures stackmap with build_id with hardware event
PERF_COUNT_HW_CPU_CYCLES.

Because we only support one ips-to-build_id lookup per cpu in NMI
context, stack_amap will not be able to do the lookup in this test.
Therefore, we didn't do compare_stack_ips(), as it will always fail.

urandom_read.c is extended to run configurable cycles so that it can be
caught by the perf event.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

13790d1c

bpf: enable stackmap with build_id in nmi context · bae77c5e

Committed by Song Liu on May 07, 2018

Currently, we cannot parse build_id in nmi context because of
up_read(&current->mm->mmap_sem), which makes stackmap with build_id
less useful. This patch enables parsing build_id in nmi by putting
the up_read() call in irq_work. To avoid memory allocation in nmi
context, we use a per-cpu variable for the irq_work. As a result, only
one irq_work per cpu is allowed. If the irq_work is in use, we
fall back to only reporting ips.

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

bae77c5e
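
The fallback described in bae77c5e can be pictured with a small user-space model. This is only a sketch under assumptions: `struct deferred_slot`, `choose_report_mode()` and `deferred_work_done()` are hypothetical names, and a C11 atomic flag stands in for the per-cpu irq_work and its busy state. It is not the kernel code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the single deferred-work slot of one CPU. */
struct deferred_slot {
    atomic_bool busy;   /* true while a deferred unlock is still pending */
};

enum report_mode { REPORT_BUILD_ID, REPORT_IPS_ONLY };

/*
 * Try to claim the slot. If it is free, the caller may do the build_id
 * lookup and defer the unlock; if it is already in use, fall back to
 * reporting raw instruction pointers only.
 */
static enum report_mode choose_report_mode(struct deferred_slot *slot)
{
    bool expected = false;

    if (atomic_compare_exchange_strong(&slot->busy, &expected, true))
        return REPORT_BUILD_ID;
    return REPORT_IPS_ONLY;
}

/* Called once the deferred work has run and the slot can be reused. */
static void deferred_work_done(struct deferred_slot *slot)
{
    atomic_store(&slot->busy, false);
}
```

Because there is exactly one slot per CPU, no allocation is needed on the hot path; the price is that a second request arriving before the deferred work has run gets the ips-only result.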

11 May, 2018 · 21 commits

Merge branch 'bpf-perf-rb-libbpf' · a84880ef

Committed by Daniel Borkmann on May 11, 2018

Jakub Kicinski says:

====================
This series started out as a follow up to the bpftool perf event dumping
patches.

As suggested by Daniel, patch 1 makes use of PERF_SAMPLE_TIME to simplify
code and improve accuracy of timestamps.

Remaining patches are trying to move perf event loop into libbpf as
suggested by Alexei.  One user for this new function is bpftool which
links with libbpf nicely, the other, unfortunately, is in samples/bpf.
Remaining patches make samples/bpf link against full libbpf.a (not just
a handful of objects).  Once we have full power of libbpf at our disposal
we can convert some of XDP samples to use libbpf loader instead of
bpf_load.c.  My understanding is that this is the desired direction,
at least for networking code.
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

a84880ef

samples: bpf: convert some XDP samples from bpf_load to libbpf · be5bca44

Committed by Jakub Kicinski on May 10, 2018

Now that we can use full powers of libbpf in BPF samples, we
should perhaps make the simplest XDP programs not depend on
bpf_load helpers.  This way newcomers will be exposed to the
recommended library from the start.

Use of bpf_prog_load_xattr() will also make it trivial to later
on request offload of the programs by simply adding ifindex to
the xattr.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

be5bca44

tools: bpf: don't complain about no kernel version for networking code · 17387dd5

Committed by Jakub Kicinski on May 10, 2018

BPF programs only have to specify the target kernel version for
tracing related hooks; in the networking world that requirement does
not really apply. Loosen the checks in libbpf to reflect that.

bpf_object__open() users will continue to see the error for backward
compatibility (and because prog_type is not available there).

The error code for a NULL file name is changed from ENOENT to EINVAL,
as it seems more appropriate; hopefully that's an OK change.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

17387dd5

tools: bpf: improve comments in libbpf.h · 2eb57bb8

Committed by Jakub Kicinski on May 10, 2018

Fix spelling mistakes, improve and clarify the language of comments
in libbpf.h.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

2eb57bb8

tools: bpf: move the event reading loop to libbpf · d0cabbb0

Committed by Jakub Kicinski on May 10, 2018

There are two copies of the event reading loop: one in bpftool and
one in the trace_helpers "library".  Consolidate them and move the code
to libbpf.  Return codes from trace_helpers are kept, but
renamed to include the LIBBPF prefix.
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

d0cabbb0

samples: bpf: compile and link against full libbpf · 5f938057

Committed by Jakub Kicinski on May 10, 2018

samples/bpf currently cherry-picks object files from tools/lib/bpf
to link against.  Just compile the full library and link statically
against it.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

5f938057

samples: bpf: rename struct bpf_map_def to avoid conflict with libbpf · 74662ea5

Committed by Jakub Kicinski on May 10, 2018

Both tools/lib/bpf/libbpf.h and samples/bpf/bpf_load.h define their
own version of struct bpf_map_def. The version in bpf_load.h has
more fields. libbpf does not support inner maps and its definition
of struct bpf_map_def lacks the related fields. Rename the definition
in bpf_load.h (samples/bpf) to avoid conflicts.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

74662ea5

tools: bpftool: use PERF_SAMPLE_TIME instead of reading the clock · e3687510

Committed by Jakub Kicinski on May 10, 2018

Ask the kernel to include the sample time in each event instead of
reading the clock.  This is also more accurate because our
clock reading was done when user space dumped the buffer,
not when the sample was produced.
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

e3687510

bpf: sync tools bpf.h uapi header · cb9c28ef

Committed by Prashant Bhole on May 09, 2018

Sync the header from include/uapi/linux/bpf.h which was updated to add
fib lookup helper function. This fixes selftests/bpf build failure.
Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

cb9c28ef

selftests/bpf: Fix bash reference in Makefile · 91bc07c9

Committed by Joe Stringer on May 10, 2018

'|& ...' is a bash 4.0+ construct which is not guaranteed to be available
when using '$(shell ...)' in a Makefile. Fall back to the more portable
'2>&1 | ...'.

Fixes the following warning during compilation:

	/bin/sh: 1: Syntax error: "&" unexpected
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

91bc07c9

Merge branch 'bpf-fib-lookup-helper' · ff1f56d9

Committed by Daniel Borkmann on May 11, 2018

David Ahern says:

====================
Provide a helper for doing a FIB and neighbor lookup in the kernel
tables from an XDP program. The helper provides a fastpath for forwarding
packets. If the packet is a local delivery or for any reason is not a
simple lookup and forward, the packet is expected to continue up the stack
for full processing.

The response from a FIB and neighbor lookup is either the egress index
with the bpf_fib_lookup struct filled in with dmac and gateway or
0 meaning the packet should continue up the stack. In time we can
revisit this to return the FIB lookup result errno if it is one of the
special RTN_'s such as RTN_BLACKHOLE (-EINVAL) so that the XDP
programs can do an early drop if desired.

Patches 1-6 do some more refactoring to IPv6 with the end goal of
extracting a FIB lookup function that aligns with fib_lookup for IPv4,
basically returning a fib6_info without creating a dst based entry.

Patch 7 adds lookup functions to the ipv6 stub. These are needed since
bpf is built into the kernel and ipv6 may not be built or loaded.

Patch 8 adds the bpf helper and 9 adds a sample program.

v3
- remove ETH_ALEN and in6_addr from uapi header

v2
- removed pkt_access from bpf_func_proto as noticed by Daniel
- added check that IPv6 forwarding is enabled
- added DaveM's ack on patches 1-7 and 9 based on v1 response and
  fact that no changes were made to them in v2

v1
- updated commit messages and cover letter
- added comment to sample program noting lack of verification on
  egress device supporting XDP

RFC v2
- fixed use of forward helper from cls_act as noted by Daniel
- in patch 1 rename fib6_lookup_1 as well for consistency
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

ff1f56d9

samples/bpf: Add example of ipv4 and ipv6 forwarding in XDP · fe616055

Committed by David Ahern on May 09, 2018

Simple example of fast-path forwarding. It has a serious flaw
in not verifying the egress device index supports XDP forwarding.
If the egress device does not, packets are dropped.

Take this only as a simple example of fast-path forwarding.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

fe616055

bpf: Provide helper to do forwarding lookups in kernel FIB table · 87f5fc7e

Committed by David Ahern on May 09, 2018

Provide a helper for doing a FIB and neighbor lookup in the kernel
tables from an XDP program. The helper provides a fastpath for forwarding
packets. If the packet is a local delivery or for any reason is not a
simple lookup and forward, the packet continues up the stack.

If it is to be forwarded, the forwarding can be done directly if the
neighbor is already known. If the neighbor does not exist, the first
few packets go up the stack for neighbor resolution. Once resolved, the
xdp program provides the fast path.

On successful lookup the nexthop dmac, current device smac and egress
device index are returned.

The API supports IPv4, IPv6 and MPLS protocols, but only IPv4 and IPv6
are implemented in this patch. The API includes layer 4 parameters if
the XDP program chooses to do deep packet inspection, to allow comparison
against ACLs implemented as FIB rules.

Header rewrite is left to the XDP program.

The lookup takes 2 flags:
- BPF_FIB_LOOKUP_DIRECT to do a lookup that bypasses FIB rules and goes
  straight to the table associated with the device (expert setting for
  those looking to maximize throughput)

- BPF_FIB_LOOKUP_OUTPUT to do a lookup from the egress perspective.
  Default is an ingress lookup.

Initial performance numbers collected by Jesper, forwarded packets/sec:

       Full stack    XDP FIB lookup    XDP Direct lookup
IPv4   1,947,969       7,074,156          7,415,333
IPv6   1,728,000       6,165,504          7,262,720

These numbers are single CPU core forwarding on a Broadwell
E5-1650 v4 @ 3.60GHz.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

87f5fc7e
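
To put the packets/sec table in 87f5fc7e into relative terms, the ratios against the full-stack rate can be computed as below. This is plain arithmetic on the figures quoted in the commit message; `speedup()` and `print_speedups()` are hypothetical helper names used only for illustration.

```c
#include <stdio.h>

/* Ratio of an XDP forwarding rate to the full-stack forwarding rate. */
static double speedup(double xdp_pps, double full_stack_pps)
{
    return xdp_pps / full_stack_pps;
}

/* Print the four ratios implied by the table in the commit message. */
static void print_speedups(void)
{
    printf("IPv4 XDP FIB lookup:    %.2fx\n", speedup(7074156.0, 1947969.0));
    printf("IPv4 XDP direct lookup: %.2fx\n", speedup(7415333.0, 1947969.0));
    printf("IPv6 XDP FIB lookup:    %.2fx\n", speedup(6165504.0, 1728000.0));
    printf("IPv6 XDP direct lookup: %.2fx\n", speedup(7262720.0, 1728000.0));
}
```

On these numbers the XDP paths forward between roughly three and a half and a little over four times as many packets per second as the full stack on the same single core.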

net/ipv6: Add fib lookup stubs for use in bpf helper · 65a2022e

Committed by David Ahern on May 09, 2018

Add stubs to retrieve a handle to an IPv6 FIB table, fib6_get_table,
a stub to do a lookup in a specific table, fib6_table_lookup, and
a stub for a full route lookup.

The stubs are needed for core bpf code to handle the case when the
IPv6 module is not built in.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

65a2022e

net/ipv6: Update fib6 tracepoint to take fib6_info · d4bea421

Committed by David Ahern on May 09, 2018

Similar to IPv4, IPv6 should use the FIB lookup result in the
tracepoint.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

d4bea421

net/ipv6: Add fib6_lookup · 138118ec

Committed by David Ahern on May 09, 2018

Add IPv6 equivalent to fib_lookup. Does a fib lookup, including rules,
but returns a FIB entry, fib6_info, rather than a dst based rt6_info.
fib6_lookup is anywhere from 140% (MULTIPLE_TABLES config disabled)
to 60% faster than any of the dst based lookup methods (without custom
rules) and 25% faster with custom rules (e.g., l3mdev rule).

Since the lookup function has a completely different signature,
fib6_rule_action is split into 2 paths: the existing one is
renamed __fib6_rule_action and a new one for the fib6_info path
is added. fib6_rule_action decides which to call based on the
lookup_ptr. If it is fib6_table_lookup then the new path is taken.

Caller must hold rcu lock as no reference is taken on the returned
fib entry.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

138118ec

net/ipv6: Refactor fib6_rule_action · cc065a9e

Committed by David Ahern on May 09, 2018

Move source address lookup from fib6_rule_action to a helper. It will be
used in a later patch by a second variant for fib6_rule_action.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

cc065a9e

net/ipv6: Extract table lookup from ip6_pol_route · 1d053da9

Committed by David Ahern on May 09, 2018

ip6_pol_route is used for ingress and egress FIB lookups. Refactor it
moving the table lookup into a separate fib6_table_lookup that can be
invoked separately and export the new function.

ip6_pol_route now calls fib6_table_lookup and uses the result to generate
a dst based rt6_info.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

1d053da9

net/ipv6: Rename rt6_multipath_select · 3b290a31

Committed by David Ahern on May 09, 2018

Rename rt6_multipath_select to fib6_multipath_select and export it.
A later patch wants access to it similar to IPv4's fib_select_path.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

3b290a31

net/ipv6: Rename fib6_lookup to fib6_node_lookup · 6454743b

Committed by David Ahern on May 09, 2018

Rename fib6_lookup to fib6_node_lookup to better reflect what it
returns. The fib6_lookup name will be used in a later patch for
an IPv6 equivalent to IPv4's fib_lookup.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

6454743b

bpf, doc: clarification for the meaning of 'id' · 68625b76

Committed by Wang YanQing on May 10, 2018

As a reader whose native language isn't English, I found the old
wording somewhat hard to follow. This patch rewords the subsection
to make its meaning clearer.

This patch also adds blank lines as separators in two places
to improve readability.
Signed-off-by: Wang YanQing <udknight@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

68625b76

10 May, 2018 · 6 commits

selftests/bpf: ignore build products · 96112e93

Committed by Sirio Balmelli on May 08, 2018

Update .gitignore files.
Signed-off-by: Sirio Balmelli <sirio@b-ad.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

96112e93

selftests/bpf: add architecture-agnostic headers · cd65cd95

Committed by Sirio Balmelli on May 08, 2018

The BPF selftests fail to build with missing headers
'asm/bitsperlong.h' and 'asm/errno.h'.

These already exist in 'tools/arch/[arch]/include';
add architecture-agnostic header files in 'tools/include/uapi'
to reference them.
Signed-off-by: Sirio Balmelli <sirio@b-ad.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

cd65cd95

xsk: fix 64-bit division · ea7e3435

Committed by Björn Töpel on May 07, 2018

i386 builds report:
  net/xdp/xdp_umem.o: In function `xdp_umem_reg':
  xdp_umem.c:(.text+0x47e): undefined reference to `__udivdi3'

This fix uses div_u64 instead of the GCC built-in.

Fixes: c0c77d8f ("xsk: add user memory registration support sockopt")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

ea7e3435
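
The link error in ea7e3435 arises because a plain 64-bit `/` on a 32-bit target makes the compiler emit a call to the libgcc routine `__udivdi3`, which the kernel does not link against. The sketch below is a user-space illustration of dividing a 64-bit value by a 32-bit one without the `/` operator, using shift-and-subtract long division. `sketch_div_u64()` is a hypothetical name, and this is not how the kernel's div_u64() is implemented.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a 64-by-32 division helper.
 * Classic binary long division: bring down one dividend bit at a time
 * and subtract the divisor whenever the running remainder allows it.
 * The divisor must be non-zero.
 */
static uint64_t sketch_div_u64(uint64_t dividend, uint32_t divisor)
{
    uint64_t quotient = 0;
    uint64_t remainder = 0;

    for (int bit = 63; bit >= 0; bit--) {
        /* remainder < divisor here, so the shift cannot overflow */
        remainder = (remainder << 1) | ((dividend >> bit) & 1);
        if (remainder >= divisor) {
            remainder -= divisor;
            quotient |= (uint64_t)1 << bit;
        }
    }
    return quotient;
}
```

For example, `sketch_div_u64(100, 7)` yields 14. In kernel code the point is simply to call the provided helper rather than write `a / b` on 64-bit operands.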

Merge branch 'bpf-nfp-programmable-rss' · a46a5c1a

Committed by Daniel Borkmann on May 09, 2018

Jakub Kicinski says:

====================
This small series adds a feature which extends BPF offload beyond
a pure host processing offload and firmly into the realm of
heterogeneous processing.  Allowing offloaded XDP programs to set
the RX queue index opens the door for defining fully programmable
RSS/n-tuple filter replacement.  In fact the device datapath will
skip the RSS processing completely if BPF decided on the queue
already, making the XDP program replace part of the standard NIC
datapath.

We hope some day the entire NIC datapath will be defined by BPF :)
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>

a46a5c1a

nfp: bpf: support setting the RX queue index · d985888f

Committed by Jakub Kicinski on May 08, 2018

BPF has access to all internal FW datapath structures, including
the structure containing RX queue selection. With a little coordination
with the datapath we can let the offloaded BPF select the RX queue.
We just need a way to tell the datapath that queue selection has already
been done and it shouldn't overwrite it. Define a bit to tell the datapath
that BPF already selected a queue (QSEL_SET); if the selected queue is not
enabled (>= number of enabled queues), the datapath will perform normal RSS.

BPF queue selection on the NIC can be used to replace standard
datapath RSS with fully programmable BPF/XDP RSS.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

d985888f
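
The queue-selection rule described in d985888f can be summarized in a few lines. This is a sketch under assumptions: `struct rx_meta` and `pick_rx_queue()` are hypothetical, and the real decision is made in NIC firmware, not in host C code. Only the QSEL_SET idea and the "not enabled means >= number of enabled queues" rule come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-packet metadata shared between BPF and the datapath. */
struct rx_meta {
    bool     qsel_set;   /* models the QSEL_SET bit written by BPF */
    uint32_t bpf_queue;  /* queue index chosen by the offloaded program */
    uint32_t rss_queue;  /* queue index normal RSS would have chosen */
};

/*
 * Honor the BPF choice only when the bit is set and the queue is enabled;
 * otherwise fall back to the normal RSS result.
 */
static uint32_t pick_rx_queue(const struct rx_meta *m, uint32_t enabled_queues)
{
    if (m->qsel_set && m->bpf_queue < enabled_queues)
        return m->bpf_queue;
    return m->rss_queue;
}
```

The out-of-range check is what keeps a buggy or stale program from steering packets to a queue that is not enabled.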

bpf: xdp: allow offloads to store into rx_queue_index · 0d830032

Committed by Jakub Kicinski on May 08, 2018

It's fairly easy for offloaded XDP programs to select the RX queue
packets go to.  We need a way of expressing this in the software.
Allow write to the rx_queue_index field of struct xdp_md for
device-bound programs.

Skip convert_ctx_access callback entirely for offloads.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

0d830032

09 May, 2018 · 9 commits

Merge branch 'bpf-btf-id' · a1d1f079

Committed by Daniel Borkmann on May 09, 2018

Martin KaFai Lau says:

====================
This series introduces BTF ID which is exposed through
the new BPF_BTF_GET_FD_BY_ID cmd, new "struct bpf_btf_info"
and new members in the "struct bpf_map_info".

Please see individual patch for details.
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

a1d1f079

bpf: btf: Tests for BPF_OBJ_GET_INFO_BY_FD and BPF_BTF_GET_FD_BY_ID · cd8b8928

Committed by Martin KaFai Lau on May 04, 2018

This patch adds a test for BPF_BTF_GET_FD_BY_ID and the new
btf_id/btf_key_id/btf_value_id in the "struct bpf_map_info".

It also modifies the existing BPF_OBJ_GET_INFO_BY_FD test
to reflect the new "struct bpf_btf_info".
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

cd8b8928

bpf: btf: Update tools/include/uapi/linux/btf.h with BTF ID · 7a01f6a3

Committed by Martin KaFai Lau on May 04, 2018

This patch syncs tools/include/uapi/linux/btf.h with
the newly introduced BTF ID support.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

7a01f6a3

bpf: btf: Some test_btf clean up · e34d98d3

Committed by Martin KaFai Lau on May 04, 2018

This patch adds a CHECK() macro for condition checking
and error reporting, similar to the one in test_progs.c.

It also counts the number of tests passed/skipped/failed and
prints them at the end of the test run.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

e34d98d3

bpf: btf: Add struct bpf_btf_info · 62dab84c

Committed by Martin KaFai Lau on May 04, 2018

During BPF_OBJ_GET_INFO_BY_FD on a btf_fd, the current bpf_attr's
info.info is directly filled with the BTF binary data.  It is
not extensible.  In this case, we want to add BTF ID.

This patch adds "struct bpf_btf_info" which has the BTF ID as
one of its members.  The BTF binary data itself is exposed through
the "btf" and "btf_size" members.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

62dab84c

bpf: btf: Introduce BTF ID · 78958fca

Committed by Martin KaFai Lau on May 04, 2018

This patch gives an ID to each loaded BTF.  The ID is allocated by
the idr like the existing prog-id and map-id.

The btf_put(map->btf) is moved to __bpf_map_put() so that the
userspace can stop seeing the BTF ID ASAP when the last BTF
refcnt is gone.

It also makes BTF accessible from userspace through the
1. new BPF_BTF_GET_FD_BY_ID command.  It is limited to CAP_SYS_ADMIN
   which is in line with the BPF_BTF_LOAD cmd and the existing
   BPF_[MAP|PROG]_GET_FD_BY_ID cmd.
2. new btf_id (and btf_key_id + btf_value_id) in "struct bpf_map_info"

Once the BTF ID handler is accessible from userspace, freeing a BTF
object has to go through a rcu period.  The BPF_BTF_GET_FD_BY_ID cmd
can then be done under a rcu_read_lock() instead of taking
spin_lock.
[Note: A similar rcu usage can be done to the existing
       bpf_prog_get_fd_by_id() in a follow up patch]

When processing the BPF_BTF_GET_FD_BY_ID cmd,
refcount_inc_not_zero() is needed because the BTF object
could be already in the rcu dead row.  btf_get() is
removed since its usage is currently limited to btf.c
alone.  refcount_inc() is used directly instead.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

78958fca
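
The reason 78958fca needs refcount_inc_not_zero() on the lookup-by-ID path can be shown with a user-space model. This is a sketch under assumptions: `struct object`, `get_unless_zero()` and `put_ref()` are hypothetical names built on C11 atomics, not the kernel's refcount_t API. The idea is that an object found through an ID table may already have dropped its last reference and be waiting to be freed, so a reference may only be taken if the count is still non-zero.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference-counted object reachable through an ID lookup. */
struct object {
    atomic_uint refcnt;
};

/* Take a reference only if the object is still alive (count != 0). */
static bool get_unless_zero(struct object *obj)
{
    unsigned int old = atomic_load(&obj->refcnt);

    while (old != 0) {
        /* On failure the compare-exchange reloads 'old' and we retry. */
        if (atomic_compare_exchange_weak(&obj->refcnt, &old, old + 1))
            return true;
    }
    return false;   /* already dying: the caller must treat it as not found */
}

/* Drop a reference; returns true when this was the last one. */
static bool put_ref(struct object *obj)
{
    return atomic_fetch_sub(&obj->refcnt, 1) == 1;
}
```

An unconditional increment would be safe only for callers that already hold a reference, which matches the commit's use of refcount_inc() inside btf.c.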

bpf: btf: Avoid WARN_ON when CONFIG_REFCOUNT_FULL=y · 82e96972

Committed by Martin KaFai Lau on May 04, 2018

If CONFIG_REFCOUNT_FULL=y, refcount_inc() WARNs when the refcount is 0.
When creating a new btf, the initial btf->refcnt is 0, which
triggered the following:

[   34.855452] refcount_t: increment on 0; use-after-free.
[   34.856252] WARNING: CPU: 6 PID: 1857 at lib/refcount.c:153 refcount_inc+0x26/0x30
....
[   34.868809] Call Trace:
[   34.869168]  btf_new_fd+0x1af6/0x24d0
[   34.869645]  ? btf_type_seq_show+0x200/0x200
[   34.870212]  ? lock_acquire+0x3b0/0x3b0
[   34.870726]  ? security_capable+0x54/0x90
[   34.871247]  __x64_sys_bpf+0x1b2/0x310
[   34.871761]  ? __ia32_sys_bpf+0x310/0x310
[   34.872285]  ? bad_area_access_error+0x310/0x310
[   34.872894]  do_syscall_64+0x95/0x3f0

This patch uses refcount_set() instead.
Reported-by: Yonghong Song <yhs@fb.com>
Tested-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

82e96972

dt-bindings: dsa: Remove unnecessary #address/#size-cells · 53a7bdfb

Committed by Fabio Estevam on May 07, 2018

If the example binding is used on a real dts file, the following DTC
warning is seen with W=1:

arch/arm/boot/dts/imx6q-b450v3.dtb: Warning (avoid_unnecessary_addr_size): /mdio-gpio/switch@0: unnecessary #address-cells/#size-cells without "ranges" or child "reg" property

Remove unnecessary #address-cells/#size-cells to improve the binding
document examples.
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

53a7bdfb

net: phy: sfp: handle cases where neither BR, min nor BR, max is given · 2b999ba8

Committed by Antoine Tenart on May 04, 2018

When computing the bitrate using values read from an SFP module EEPROM,
we use the nominal BR plus BR,min and BR,max to determine the
boundaries. But in some cases BR,min and BR,max aren't provided, which
led the SFP code to end up having the nominal value for both the minimum
and maximum bitrate values. When using a passive cable, the nominal
value should be used as the maximum one, and there is no minimum one
so we should use 0.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

2b999ba8
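
The bound selection described in 2b999ba8 can be sketched as follows. This is an illustration under assumptions: `struct br_range`, `bitrate_bounds()`, the Mbps unit and the percent margins are hypothetical simplifications, not the SFP EEPROM encoding or the driver's actual code. Margins of 0 model the case where neither BR,min nor BR,max is given.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: allowed bitrate window in Mbps. */
struct br_range {
    uint32_t min_mbps;
    uint32_t max_mbps;
};

/*
 * below_pct / above_pct are margins around the nominal rate, in percent
 * (assumed <= 100 for below_pct); both are 0 when the module does not
 * provide them.
 */
static struct br_range bitrate_bounds(uint32_t nominal_mbps,
                                      uint32_t below_pct,
                                      uint32_t above_pct,
                                      bool passive_cable)
{
    struct br_range r;

    r.min_mbps = nominal_mbps - nominal_mbps * below_pct / 100;
    r.max_mbps = nominal_mbps + nominal_mbps * above_pct / 100;

    /*
     * With no margins, min == max == nominal. For a passive cable the
     * nominal value is only an upper limit, so there is no minimum.
     */
    if (passive_cable && r.min_mbps == r.max_mbps)
        r.min_mbps = 0;

    return r;
}
```

With these assumptions, a passive cable reporting a nominal 10300 Mbps and no margins gets the window 0 to 10300, while the same values on a non-passive module keep both bounds at 10300.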

08 May, 2018 · 1 commit

Merge branch 'bnxt_en-Fixes-for-net-next' · 8d42eada

Committed by David S. Miller on May 08, 2018

Michael Chan says:

====================
bnxt_en: Fixes for net-next.

This series includes a bug fix for a regression in firmware message polling
introduced recently on net-next.  There are 3 additional minor fixes for
unsupported link speed checking, VF MAC address handling, and setting
PHY eeprom length.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

8d42eada
