Commits · c0a5a21c25f37c9fd7b36072f9968cdff1e4aa13 · openeuler / Kernel

26 Apr, 2022 1 commit

bpf: Allow storing referenced kptr in map · c0a5a21c

Committed by Kumar Kartikeya Dwivedi on Apr 25, 2022

Extending the code in previous commits, introduce referenced kptr
support, which needs to be tagged using the 'kptr_ref' tag instead. Unlike
unreferenced kptrs, referenced kptrs have a lot more restrictions. In
addition to the type matching, only the newly introduced bpf_kptr_xchg
helper is allowed to modify the map value at that offset. The helper
transfers the referenced pointer being stored into the map, releases the
reference state for the program, returns the old value, and
creates new reference state for the returned pointer.

Similar to the unreferenced pointer case, the return value for this case will
also be PTR_TO_BTF_ID_OR_NULL. The reference for the returned pointer
must either be eventually released by calling the corresponding release
function or be transferred into another map.

It is also allowed to call bpf_kptr_xchg with a NULL pointer, to clear
the value, and obtain the old value if any.

BPF_LDX, BPF_STX, and BPF_ST cannot access referenced kptr. A future
commit will permit using BPF_LDX for such pointers while attempting to
make it safe, since the lifetime of the object won't be guaranteed.

There are valid reasons to enforce the restriction of permitting only
bpf_kptr_xchg to operate on referenced kptr. The pointer value must be
consistent in face of concurrent modification, and any prior values
contained in the map must also be released before a new one is moved
into the map. To ensure proper transfer of this ownership, bpf_kptr_xchg
returns the old value, which the verifier would require the user to
either free or move into another map, and releases the reference held
for the pointer being moved in.

In the future, direct BPF_XCHG instruction may also be permitted to work
like bpf_kptr_xchg helper.

Note that process_kptr_func doesn't have to call
check_helper_mem_access, since we already disallow rdonly/wronly flags
for map, which is what check_map_access_type checks, and we already
ensure the PTR_TO_MAP_VALUE refers to kptr by obtaining its off_desc,
so check_map_access is also not required.
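
The ownership-transfer rule described above can be modeled in plain user-space C. This is only a sketch under stated assumptions, not kernel or BPF code: `struct obj`, `struct slot`, `slot_xchg` and `obj_put` are hypothetical names, and C11 `atomic_exchange` stands in for the bpf_kptr_xchg helper.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical refcounted object standing in for a kernel object. */
struct obj {
    int refcnt;
};

/* Hypothetical map slot that holds at most one owned pointer. */
struct slot {
    _Atomic(struct obj *) ptr;
};

/*
 * Model of the exchange: the caller gives up ownership of new_ptr (which may
 * be NULL to clear the slot) and receives ownership of whatever was stored
 * before (which may also be NULL). The swap is a single atomic operation, so
 * concurrent callers never observe a half-updated slot.
 */
static struct obj *slot_xchg(struct slot *s, struct obj *new_ptr)
{
    return atomic_exchange(&s->ptr, new_ptr);
}

/* Hypothetical release function: drops the reference the caller owns. */
static void obj_put(struct obj *o)
{
    if (o)
        o->refcnt--;
}
```

In this model, the value returned by `slot_xchg` must be passed either to `obj_put` or to another `slot_xchg`, which mirrors the two options the commit message gives for the returned pointer.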
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220424214901.2743946-4-memxor@gmail.com

c0a5a21c

08 Apr, 2022 1 commit

net-core: rx_otherhost_dropped to core_stats · 794c24e9

Committed by Jeffrey Ji on Apr 06, 2022

Increment the rx_otherhost_dropped counter when a packet is dropped due to
a mismatched dest MAC addr.

An example of when this drop can occur is when manually crafting raw
packets that will be consumed by a user space application via a tap
device. For testing purposes, local traffic was generated using trafgen
for the client and netcat to start a server.

Tested: Created 2 netns, sent 1 packet using trafgen from 1 to the other
with "{eth(daddr=$INCORRECT_MAC...}", verified that iproute2 showed the
counter was incremented. (Also had to modify iproute2 to show the stat,
additional patch for that coming next.)
Signed-off-by: Jeffrey Ji <jeffreyji@google.com>
Reviewed-by: Brian Vazquez <brianvv@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220406172600.1141083-1-jeffreyjilinux@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

794c24e9

06 Apr, 2022 1 commit

net, uapi: remove inclusion of arpa/inet.h · 1ee375d7

Committed by Nick Desaulniers on Apr 04, 2022

In include/uapi/linux/tipc_config.h, there's a comment that it includes
arpa/inet.h for ntohs; but ntohs is not defined in any UAPI header. For
now, reuse the definitions from include/linux/byteorder/generic.h, since
the various conversion functions do exist in UAPI headers:
include/uapi/linux/byteorder/big_endian.h
include/uapi/linux/byteorder/little_endian.h

We would like to get to the point where we can build UAPI header tests
with -nostdinc, meaning that kernel UAPI headers should not have a
circular dependency on libc headers.
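
As an illustration of why ntohs does not strictly need libc, the conversion can be written with nothing but fixed-width integer types. This is a sketch only; `example_ntohs` is a hypothetical name and is not the definition the kernel headers use.

```c
#include <stdint.h>

/*
 * Convert a 16-bit value from network (big-endian) byte order to host order
 * by reading its bytes in memory order. This works on either host
 * endianness and needs no libc networking header.
 */
static inline uint16_t example_ntohs(uint16_t net)
{
    const unsigned char *p = (const unsigned char *)&net;

    return (uint16_t)((p[0] << 8) | p[1]);
}
```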

Link: https://android-review.googlesource.com/c/platform/bionic/+/2048127
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1ee375d7

04 Apr, 2022 1 commit

bpf: Correct the comment for BTF kind bitfield · 66df0fdb

Committed by Haiyue Wang on Apr 03, 2022

The commit 8fd88691 ("bpf: Add BTF_KIND_FLOAT to uapi") has extended
the BTF kind bitfield from 4 to 5 bits, correct the comment.
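
The difference between a 4-bit and a 5-bit kind field can be shown with a small sketch. The helper names are hypothetical; the shift of 24 and the value 16 for the float kind are assumptions taken from the BTF UAPI header as I understand it.

```c
#include <stdint.h>

/* Assumed value of the float kind; it is the first kind that needs bit 4. */
#define EXAMPLE_KIND_FLOAT 16u

/* Extract the kind with the current 5-bit mask (bits 24-28 of info). */
static inline uint32_t example_btf_kind(uint32_t info)
{
    return (info >> 24) & 0x1f;
}

/* Extract the kind with the older 4-bit mask, for comparison only. */
static inline uint32_t example_btf_kind_4bit(uint32_t info)
{
    return (info >> 24) & 0x0f;
}
```

With a 4-bit mask, kind 16 would be truncated to 0, which is why the field (and the comment describing it) had to grow to 5 bits.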
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220403115327.205964-1-haiyue.wang@intel.com

66df0fdb

03 Apr, 2022 1 commit

tracing: mark user_events as BROKEN · 1cd927ad

Committed by Steven Rostedt (Google) on Apr 01, 2022

After being merged, user_events became more visible to a wider audience
that has concerns with the current API.

It is too late to fix this for this release, but instead of a full
revert, just mark it as BROKEN (which prevents it from being selected in
make config). Then we can work on finding a better API. If that fails,
then it will need to be completely reverted.

To not have the code silently bitrot, still allow building it with
COMPILE_TEST.

And to prevent the uapi header from being installed, then later changed,
and then have an old distro user space see the old version, move the
header file out of the uapi directory.

Surround the include with CONFIG_COMPILE_TEST to the current location,
but when the BROKEN tag is taken off, it will use the uapi directory,
and fail to compile. This is a good way to remind us to move the header
back.

Link: https://lore.kernel.org/all/20220330155835.5e1f6669@gandalf.local.home
Link: https://lkml.kernel.org/r/20220330201755.29319-1-mathieu.desnoyers@efficios.com
Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

1cd927ad

02 Apr, 2022 2 commits

tracing: Move user_events.h temporarily out of include/uapi · 5cfff569

Committed by Steven Rostedt (Google) on Apr 01, 2022

While the user_events API is under development and has been marked as BROKEN
so that the API does not become fixed, move the header file out of the uapi
directory. This is to prevent it from being installed, then later changed,
and then have an old distro user space update with a new kernel, where
applications see user_events being available, but the old header is in
place, and then they get compiled incorrectly.

Also, surround the include with CONFIG_COMPILE_TEST to the current
location, but when the BROKEN tag is taken off, it will use the uapi
directory, and fail to compile. This is a good way to remind us to move
the header back.

Link: https://lore.kernel.org/all/20220330155835.5e1f6669@gandalf.local.home
Link: https://lkml.kernel.org/r/20220330201755.29319-1-mathieu.desnoyers@efficios.com
Link: https://lkml.kernel.org/r/20220401143903.188384f3@gandalf.local.home
Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

5cfff569

tracing/user_events: Remove eBPF interfaces · 768c1e7f

Committed by Beau Belgrave on Mar 29, 2022

Remove eBPF interfaces within user_events to ensure they are fully
reviewed.

Link: https://lore.kernel.org/all/20220329165718.GA10381@kbox/
Link: https://lkml.kernel.org/r/20220329173051.10087-1-beaub@linux.microsoft.com
Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

768c1e7f

30 Mar, 2022 1 commit

loop: fix ioctl calls using compat_loop_info · f941c51e

Committed by Carlos Llamas on Mar 29, 2022

Support for cryptoloop was deleted in commit 47e96246 ("block:
remove support for cryptoloop and the xor transfer"), making the usage
of loop_info->lo_encrypt_type obsolete. However, this member was also
removed from the compat_loop_info definition and this breaks userspace
ioctl calls for 32-bit binaries and CONFIG_COMPAT=y.

This patch restores the compat_loop_info->lo_encrypt_type member and
marks it obsolete, as is done in the uapi header definitions.

Fixes: 47e96246 ("block: remove support for cryptoloop and the xor transfer")
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220329201815.1347500-1-cmllamas@google.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

f941c51e

29 Mar, 2022 5 commits

vdpa: support exposing the count of vqs to userspace · b04d910a

Committed by Longpeng on Mar 15, 2022

- GET_VQS_COUNT: the count of virtqueues that are exposed
Signed-off-by: Longpeng <longpeng2@huawei.com>
Link: https://lore.kernel.org/r/20220315032553.455-4-longpeng2@huawei.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>

b04d910a

vdpa: support exposing the config size to userspace · a61280dd

Committed by Longpeng on Mar 15, 2022

- GET_CONFIG_SIZE: return the size of the virtio config space.

The size contains the fields which are conditional on feature
bits.
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Longpeng <longpeng2@huawei.com>
Link: https://lore.kernel.org/r/20220315032553.455-2-longpeng2@huawei.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>

a61280dd

virtio-crypto: introduce akcipher service · 24e19590

Committed by zhenwei pi on Mar 02, 2022

Introduce asymmetric service definition, asymmetric operations and
several well known algorithms.
Co-developed-by: lei he <helei.sig11@bytedance.com>
Signed-off-by: lei he <helei.sig11@bytedance.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Link: https://lore.kernel.org/r/20220302033917.1295334-3-pizhenwei@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>

24e19590

virtio_crypto: Introduce VIRTIO_CRYPTO_NOSPC · 13d640a3

Committed by zhenwei pi on Mar 02, 2022

Based on the latest virtio crypto spec, define VIRTIO_CRYPTO_NOSPC.
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Link: https://lore.kernel.org/r/20220302033917.1295334-2-pizhenwei@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

13d640a3

Add definition of VIRTIO_F_IN_ORDER feature bit · 90a6951b

Committed by Gautam Dawar on Feb 15, 2022

This patch adds the definition of VIRTIO_F_IN_ORDER feature bit
in the relevant header file to make it available in QEMU's
linux standard header file virtio_config.h, which is updated using
scripts/update-linux-headers.sh
Signed-off-by: Gautam Dawar <gdawar@xilinx.com>
Link: https://lore.kernel.org/r/20220215053430.24650-1-gdawar@xilinx.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>

90a6951b

24 Mar, 2022 4 commits

io_uring: remove IORING_CQE_F_MSG · 7ef66d18

Committed by Jens Axboe on Mar 24, 2022

This was introduced with the message ring opcode, but isn't strictly
required for the request itself. The sender can encode what is needed
in user_data, which is passed to the receiver. It's unclear if having
a separate flag that essentially says "This CQE did not originate from
an SQE on this ring" provides any real utility to applications. While
we can always re-introduce a flag to provide this information, we cannot
take it away at a later point in time.

Remove the flag while we still can, before it's in a released kernel.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

7ef66d18

Documentation/sparse: add hints about __CHECKER__ · 179fd6ba

Committed by Bjorn Helgaas on Mar 23, 2022

Several attributes depend on __CHECKER__, but previously there was no
clue in the tree about when __CHECKER__ might be defined.  Add hints at
the most common places (__kernel, __user, __iomem, __bitwise) and in the
sparse documentation.

Link: https://lkml.kernel.org/r/20220310220927.245704-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

179fd6ba

linux/types.h: remove unnecessary __bitwise__ · c724c866

Committed by Bjorn Helgaas on Mar 23, 2022

There are no users of "__bitwise__" except the definition of
"__bitwise".  Remove __bitwise__ and define __bitwise directly.
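
A minimal sketch of what defining __bitwise directly looks like, assuming the usual pattern in which the attribute is only visible to sparse (which defines __CHECKER__) and expands to nothing for a normal compiler. The typedef and helper names are hypothetical and the macro bodies are my reading of the kernel headers, not a copy of them.

```c
#include <stdint.h>

/* Only sparse defines __CHECKER__; a regular compiler sees empty macros. */
#ifdef __CHECKER__
#define __bitwise __attribute__((bitwise))
#define __force   __attribute__((force))
#else
#define __bitwise
#define __force
#endif

/* Hypothetical big-endian 16-bit type that sparse would keep distinct. */
typedef uint16_t __bitwise be16_example;

/* Store v as two bytes in big-endian order, regardless of host endianness. */
static be16_example to_be16_example(uint16_t v)
{
    uint16_t out;
    unsigned char *p = (unsigned char *)&out;

    p[0] = (unsigned char)(v >> 8);
    p[1] = (unsigned char)(v & 0xff);
    return (__force be16_example)out;
}

/* Read the two bytes back in big-endian order. */
static uint16_t from_be16_example(be16_example v)
{
    const unsigned char *p = (const unsigned char *)&v;

    return (uint16_t)((p[0] << 8) | p[1]);
}
```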

This is a follow-up to 05de9700 ("linux/types.h: enable endian
checks for all sparse builds").

[akpm@linux-foundation.org: change the tools/include/linux/types.h definition also]

Link: https://lkml.kernel.org/r/20220310220927.245704-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c724c866

rtc: add new RTC_FEATURE_ALARM_WAKEUP_ONLY feature · e99653af

Committed by Alexandre Belloni on Mar 09, 2022

Some RTCs have an IRQ pin that is not connected to a CPU interrupt but
rather directly to a PMIC or power supply. In that case, it is still useful
to be able to set alarms but we shouldn't expect interrupts.
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20220309162301.61679-22-alexandre.belloni@bootlin.com

e99653af

23 Mar, 2022 1 commit

userfaultfd: provide unmasked address on page-fault · 824ddc60

Committed by Nadav Amit on Mar 22, 2022

Userfaultfd is supposed to provide the full address (i.e., unmasked) of
the faulting access back to userspace.  However, that has not been the case
for quite some time.

Even running "userfaultfd_demo" from the userfaultfd man page provides the
wrong output (and contradicts the man page).  Notice that
"UFFD_EVENT_PAGEFAULT event" shows the masked address (7fc5e30b3000) and
not the first read address (0x7fc5e30b300f).

	Address returned by mmap() = 0x7fc5e30b3000

	fault_handler_thread():
	    poll() returns: nready = 1; POLLIN = 1; POLLERR = 0
	    UFFD_EVENT_PAGEFAULT event: flags = 0; address = 7fc5e30b3000
		(uffdio_copy.copy returned 4096)
	Read address 0x7fc5e30b300f in main(): A
	Read address 0x7fc5e30b340f in main(): A
	Read address 0x7fc5e30b380f in main(): A
	Read address 0x7fc5e30b3c0f in main(): A

The exact address is useful for various reasons and specifically for
prefetching decisions.  If it is known that the memory is populated by
certain objects whose size is not page-aligned, then based on the faulting
address, the uffd-monitor can decide whether to prefetch and prefault the
adjacent page.

This bug has been in the kernel for quite some time: since commit
1a29d85e ("mm: use vmf->address instead of of vmf->virtual_address"),
which dates back to 2016.  A concern has been
raised that existing userspace applications might rely on the old/wrong
behavior in which the address is masked.  Therefore, it was suggested to
provide the masked address unless the user explicitly asks for the exact
address.

Add a new userfaultfd feature UFFD_FEATURE_EXACT_ADDRESS to direct
userfaultfd to provide the exact address.  Add a new "real_address" field
to vmf to hold the unmasked address.  Provide the address to userspace
accordingly.

Initialize real_address in various code-paths to be consistent with
address, even when it is not used, to be on the safe side.
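
The difference between the masked and the exact address is plain page-alignment arithmetic. The sketch below reproduces it with the numbers from the demo output; the function names are hypothetical and `exact_requested` merely stands in for whether UFFD_FEATURE_EXACT_ADDRESS was enabled.

```c
#include <stdint.h>

/* Round an address down to the start of its page.
 * page_size must be a power of two. */
static uint64_t example_page_mask(uint64_t addr, uint64_t page_size)
{
    return addr & ~(page_size - 1);
}

/* Pick the address reported to userspace: the exact faulting address when
 * the feature was requested, otherwise the page-aligned one. */
static uint64_t example_reported_address(uint64_t fault_addr,
                                         uint64_t page_size,
                                         int exact_requested)
{
    return exact_requested ? fault_addr
                           : example_page_mask(fault_addr, page_size);
}
```

For the first read in the demo, a fault at 0x7fc5e30b300f with 4096-byte pages is reported as 0x7fc5e30b3000 when masked and as 0x7fc5e30b300f when the exact address is requested.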

[namit@vmware.com: initialize real_address on all code paths, per Jan]
  Link: https://lkml.kernel.org/r/20220226022655.350562-1-namit@vmware.com
[akpm@linux-foundation.org: fix typo in comment, per Jan]

Link: https://lkml.kernel.org/r/20220218041003.3508-1-namit@vmware.com
Signed-off-by: Nadav Amit <namit@vmware.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

824ddc60

21 Mar, 2022 2 commits

KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2 · 6d849191

Committed by Oliver Upton on Mar 01, 2022

KVM_CAP_DISABLE_QUIRKS is irrevocably broken. The capability does not
advertise the set of quirks which may be disabled to userspace, so it is
impossible to predict the behavior of KVM. Worse yet,
KVM_CAP_DISABLE_QUIRKS will tolerate any value for cap->args[0], meaning
it fails to reject attempts to set invalid quirk bits.

The only valid workaround for the quirky quirks API is to add a new CAP.
Actually advertise the set of quirks that can be disabled to userspace
so it can predict KVM's behavior. Reject values for cap->args[0] that
contain invalid bits.

Finally, add documentation for the new capability and describe the
existing quirks.
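
The validation the new capability adds amounts to rejecting any bit outside an advertised mask. A minimal sketch, assuming a made-up mask value; `EXAMPLE_VALID_QUIRKS` and `example_disable_quirks` are hypothetical and do not reflect the actual KVM quirk bits or code.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical set of quirk bits that may be disabled. The real set is
 * whatever the kernel advertises for the capability. */
#define EXAMPLE_VALID_QUIRKS 0x1fULL

/* Accept the request only if every requested bit is in the advertised mask;
 * on success, record which quirks are now disabled. */
static int example_disable_quirks(uint64_t requested, uint64_t *disabled)
{
    if (requested & ~EXAMPLE_VALID_QUIRKS)
        return -EINVAL;

    *disabled = requested;
    return 0;
}
```

Because userspace can read the mask first, it can predict which requests will succeed, which is the property the commit message says the older capability lacked.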
Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20220301060351.442881-5-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

6d849191

bpf: Adjust BPF stack helper functions to accommodate skip > 0 · ee2a0988

Committed by Namhyung Kim on Mar 14, 2022

Let's say that the caller has storage for num_elem stack frames.  Then,
the BPF stack helper functions walk the stack for only num_elem frames.
This means that if skip > 0, one keeps only 'num_elem - skip' frames.

This is because it sets init_nr in the perf_callchain_entry to the end
of the buffer to save num_elem entries only.  I believe it was because
the perf callchain code unwound the stack frames until it reached the
global max size (sysctl_perf_event_max_stack).

However it now has perf_callchain_entry_ctx.max_stack to limit the
iteration locally.  This simplifies the code to handle init_nr in the
BPF callstack entries and removes the confusion with the perf_event's
__PERF_SAMPLE_CALLCHAIN_EARLY which sets init_nr to 0.

Also change the comment on bpf_get_stack() in the header file to be
more explicit what the return value means.
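
The frame arithmetic described above can be modeled in a few lines. This is a simplified user-space sketch, not the kernel code: the function names are hypothetical, `depth` is the true stack depth, `max_stack` stands in for sysctl_perf_event_max_stack, and the "new" variant assumes the adjusted helpers walk num_elem + skip frames, capped by the global maximum.

```c
/* Old behavior as described in the commit message: only num_elem frames are
 * walked, so skipping leaves at most num_elem - skip frames. */
static unsigned int example_frames_kept_old(unsigned int depth,
                                            unsigned int num_elem,
                                            unsigned int skip)
{
    unsigned int walked = depth < num_elem ? depth : num_elem;

    return walked > skip ? walked - skip : 0;
}

/* Assumed new behavior: walk skip extra frames (capped by max_stack) so the
 * caller's buffer can still be filled after skipping. */
static unsigned int example_frames_kept_new(unsigned int depth,
                                            unsigned int num_elem,
                                            unsigned int skip,
                                            unsigned int max_stack)
{
    unsigned int max_depth = num_elem + skip;
    unsigned int walked;

    if (max_depth > max_stack)
        max_depth = max_stack;
    walked = depth < max_depth ? depth : max_depth;

    return walked > skip ? walked - skip : 0;
}
```

For a 20-frame stack, a buffer of 8 entries, and skip = 3, the old model keeps 5 frames while the new one keeps all 8.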

Fixes: c195651e ("bpf: add bpf_get_stack helper")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/30a7b5d5-6726-1cc2-eaee-8da2828a9a9c@oracle.com
Link: https://lore.kernel.org/bpf/20220314182042.71025-1-namhyung@kernel.org
Based-on-patch-by: Eugene Loh <eugene.loh@oracle.com>

ee2a0988

18 Mar, 2022 8 commits

ptrace: Move setting/clearing ptrace_message into ptrace_stop · 336d4b81

Committed by Eric W. Biederman on Jan 27, 2022

Today ptrace_message is easy to overlook as it is not a core part of
ptrace_stop. It has been overlooked so much that there are places
that set ptrace_message and don't clear it, and places that never set
it. So if you get an unlucky sequence of events the ptracer may be
able to read a ptrace_message that does not apply to the current
ptrace stop.

Move setting of ptrace_message into ptrace_stop so that it always gets
set before the stop, and always gets cleared after the stop. This
prevents nonsense from being reported to userspace and makes
ptrace_message more visible in the ptrace helper functions so that
kernel developers can see it.

Link: https://lkml.kernel.org/r/87bky67qfv.fsf_-_@email.froward.int.ebiederm.org
Acked-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

336d4b81

counter: add new COUNTER_EVENT_CHANGE_OF_STATE · 73799a88

Committed by Oleksij Rempel on Mar 15, 2022

Add new counter event to notify user space about every new counter
pulse.

Link: https://lore.kernel.org/r/20220203135727.2374052-2-o.rempel@pengutronix.de
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: William Breathitt Gray <vilhelm.gray@gmail.com>
Link: https://lore.kernel.org/r/486a5de67414470449efb84d06a2f2214f4bb31d.1647373009.git.vilhelm.gray@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

73799a88

rfkill: make new event layout opt-in · 54f586a9

Committed by Johannes Berg on Mar 16, 2022

Again new complaints surfaced that we had broken the ABI here,
although previously all the userspace tools had agreed that it
was their mistake and fixed it. Yet now there are cases (e.g.
RHEL) that want to run old userspace with newer kernels, and
thus are broken.

Since this is a bit of a whack-a-mole thing, change the whole
extensibility scheme of rfkill to no longer just rely on the
message lengths, but instead require userspace to opt in via a
new ioctl to a given maximum event size that it is willing to
understand.

By default, set that to RFKILL_EVENT_SIZE_V1 (8), so that the
behaviour for userspace not calling the ioctl will look as if
it's just running on an older kernel.
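
The opt-in rule can be sketched as a size clamp. This is an illustrative model rather than the driver code: the function name is hypothetical, 8 is RFKILL_EVENT_SIZE_V1 from the commit message, and the extended size of 9 bytes is an assumption made for the example.

```c
#include <stddef.h>

#define EXAMPLE_EVENT_SIZE_V1  8 /* RFKILL_EVENT_SIZE_V1 per the message */
#define EXAMPLE_EVENT_SIZE_EXT 9 /* assumed size of the extended layout */

/* Number of bytes handed to userspace for one event: never more than the
 * reader asked for, and never more than the size it opted in to. */
static size_t example_bytes_delivered(size_t read_len, size_t opted_max)
{
    size_t sz = EXAMPLE_EVENT_SIZE_EXT;

    if (read_len < sz)
        sz = read_len;
    if (opted_max < sz)
        sz = opted_max;
    return sz;
}
```

A reader that never calls the new ioctl keeps the default maximum of 8, so even a large read buffer receives only the original 8-byte layout.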

Fixes: 14486c82 ("rfkill: add a reason to the HW rfkill state")
Cc: stable@vger.kernel.org # 5.11+
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220316212749.16491491b270.Ifcb1950998330a596f29a2a162e00b7546a1d6d0@changeid

54f586a9

bpf: Add cookie support to programs attached with kprobe multi link · ca74823c

Committed by Jiri Olsa on Mar 16, 2022

Add support for calling the bpf_get_attach_cookie helper from
kprobe programs attached with a kprobe multi link.

The cookies are provided as an array of u64 values, where each
value is paired with the provided function address or symbol
at the same array index.

When cookie array is provided it's sorted together with
addresses (check bpf_kprobe_multi_cookie_swap). This way
we can find cookie based on the address in
bpf_get_attach_cookie helper.
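
The idea of sorting two parallel arrays together and then looking a cookie up by address can be sketched as follows. This is a user-space illustration with hypothetical function names; it uses a simple insertion sort and a hand-written binary search rather than whatever sort and search routines the kernel uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Sort addrs in ascending order and apply every move to cookies as well, so
 * cookies[i] stays paired with addrs[i]. */
static void example_sort_with_cookies(uint64_t *addrs, uint64_t *cookies,
                                      size_t cnt)
{
    for (size_t i = 1; i < cnt; i++) {
        uint64_t a = addrs[i], c = cookies[i];
        size_t j = i;

        while (j > 0 && addrs[j - 1] > a) {
            addrs[j] = addrs[j - 1];
            cookies[j] = cookies[j - 1];
            j--;
        }
        addrs[j] = a;
        cookies[j] = c;
    }
}

/* Binary-search the sorted addresses; return the paired cookie, or 0 when
 * the address is not present (0 is this sketch's "no cookie" value). */
static uint64_t example_find_cookie(const uint64_t *addrs,
                                    const uint64_t *cookies,
                                    size_t cnt, uint64_t ip)
{
    size_t lo = 0, hi = cnt;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (addrs[mid] == ip)
            return cookies[mid];
        if (addrs[mid] < ip)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}
```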
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-7-jolsa@kernel.org

ca74823c

bpf: Add multi kprobe link · 0dcac272

Committed by Jiri Olsa on Mar 16, 2022

Add a new link type, BPF_LINK_TYPE_KPROBE_MULTI, that attaches a kprobe
program through the fprobe API.

The fprobe API allows attaching a probe to multiple functions at once
very quickly, because it works on top of ftrace. On the other hand, this
limits the probe point to the function entry or return.

The kprobe program gets the same pt_regs input ctx as when it's attached
through the perf API.

Add a new attach type, BPF_TRACE_KPROBE_MULTI, that allows attaching a
kprobe to multiple functions with the new link.

The user provides an array of addresses or symbols, with a count, to attach
the kprobe program to. The new link_create uapi interface looks like:

  struct {
          __u32           flags;
          __u32           cnt;
          __aligned_u64   syms;
          __aligned_u64   addrs;
  } kprobe_multi;

The flags field accepts a single bit, BPF_F_KPROBE_MULTI_RETURN, which
creates a return multi kprobe.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-4-jolsa@kernel.org

0dcac272

net: bridge: mst: Support setting and reporting MST port states · 122c2948

Committed by Tobias Waldekranz on Mar 16, 2022

Make it possible to change the port state in a given MSTI by extending
the bridge port netlink interface (RTM_SETLINK on PF_BRIDGE). The
proposed iproute2 interface would be:

    bridge mst set dev <PORT> msti <MSTI> state <STATE>

Current states in all applicable MSTIs can also be dumped via a
corresponding RTM_GETLINK. The proposed iproute interface looks like
this:

$ bridge mst
port              msti
vb1               0
		    state forwarding
		  100
		    state disabled
vb2               0
		    state forwarding
		  100
		    state forwarding

The preexisting per-VLAN states are still valid in the MST
mode (although they are read-only), and can be queried as usual if one
is interested in knowing a particular VLAN's state without having to
care about the VID to MSTI mapping (in this example VLAN 20 and 30 are
bound to MSTI 100):

$ bridge -d vlan
port              vlan-id
vb1               10
		    state forwarding mcast_router 1
		  20
		    state disabled mcast_router 1
		  30
		    state disabled mcast_router 1
		  40
		    state forwarding mcast_router 1
vb2               10
		    state forwarding mcast_router 1
		  20
		    state forwarding mcast_router 1
		  30
		    state forwarding mcast_router 1
		  40
		    state forwarding mcast_router 1
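
How a per-VLAN state follows from the VID to MSTI mapping can be sketched with two small lookup tables. This is a user-space illustration only; all type and function names are hypothetical, and the data in the usage note is taken from the listings above.

```c
#include <stddef.h>

enum example_state { EXAMPLE_DISABLED, EXAMPLE_FORWARDING };

/* One VID to MSTI binding; VIDs without an entry stay in MSTI 0 (the CST). */
struct example_vid_binding {
    unsigned short vid;
    unsigned short msti;
};

/* State of one MSTI on one port. */
struct example_msti_entry {
    unsigned short msti;
    enum example_state state;
};

/* Resolve the MSTI a VLAN belongs to. */
static unsigned short example_vid_to_msti(const struct example_vid_binding *b,
                                          size_t nb, unsigned short vid)
{
    for (size_t i = 0; i < nb; i++)
        if (b[i].vid == vid)
            return b[i].msti;
    return 0;
}

/* A VLAN's state on a port is the state of its MSTI on that port. */
static enum example_state
example_vlan_state(const struct example_vid_binding *b, size_t nb,
                   const struct example_msti_entry *m, size_t nm,
                   unsigned short vid)
{
    unsigned short msti = example_vid_to_msti(b, nb, vid);

    for (size_t i = 0; i < nm; i++)
        if (m[i].msti == msti)
            return m[i].state;
    return EXAMPLE_DISABLED; /* arbitrary default for this sketch */
}
```

With VLANs 20 and 30 bound to MSTI 100, and MSTI 100 disabled on vb1, the lookup reports VLANs 20 and 30 as disabled on vb1 and VLANs 10 and 40 as forwarding, matching the `bridge -d vlan` listing.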
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

122c2948

net: bridge: mst: Allow changing a VLAN's MSTI · 8c678d60

Committed by Tobias Waldekranz on Mar 16, 2022

Allow a VLAN to move out of the CST (MSTI 0), to an independent tree.

The user manages the VID to MSTI mappings via a global VLAN
setting. The proposed iproute2 interface would be:

    bridge vlan global set dev br0 vid <VID> msti <MSTI>

Changing the state in non-zero MSTIs is still not supported, but will
be addressed in upcoming changes.
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

8c678d60

net: bridge: mst: Multiple Spanning Tree (MST) mode · ec7328b5

Committed by Tobias Waldekranz on Mar 16, 2022

Allow the user to switch from the current per-VLAN STP mode to an MST
mode.

Up to this point, per-VLAN STP states were always isolated from each
other. This is in contrast to the MSTP standard (802.1Q-2018, Clause
13.5), where VLANs are grouped into MST instances (MSTIs), and the
state is managed on a per-MSTI level, rather than at the per-VLAN
level.

Perhaps due to the prevalence of the standard, many switching ASICs
are built after the same model. Therefore, add a corresponding MST
mode to the bridge, which we can later add offloading support for in a
straight-forward way.

For now, all VLANs are fixed to MSTI 0, also called the Common
Spanning Tree (CST). That is, all VLANs will follow the port-global
state.

Upcoming changes will make this actually useful by allowing VLANs to
be mapped to arbitrary MSTIs and allow individual MSTI states to be
changed.
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ec7328b5

17 Mar, 2022 1 commit

net: geneve: support IPv4/IPv6 as inner protocol · 435fe1c0

Committed by Eyal Birger on Mar 16, 2022

This patch adds support for encapsulating IPv4/IPv6 within GENEVE.

In order to use this, a new IFLA_GENEVE_INNER_PROTO_INHERIT flag needs
to be provided at device creation. This property cannot be changed for
the time being.

In case IP traffic is received on a non-tun device the drop count is
increased.
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Link: https://lore.kernel.org/r/20220316061557.431872-1-eyal.birger@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

435fe1c0

16 Mar, 2022 1 commit

drm/amdkfd: CRIU export dmabuf handles for GTT BOs · 65722ff6

Committed by David Yat Sin on Mar 08, 2022

Export dmabuf handles for GTT BOs so that their contents can be accessed
using SDMA during checkpoint/restore.

v2: Squash in fix from David to set dmabuf handle to invalid for BOs
that cannot be accessed using SDMA during checkpoint/restore.
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

65722ff6

14 Mar, 2022 5 commits

btrfs: add definitions and documentation for encoded I/O ioctls · dcb77a9a

Committed by Omar Sandoval on Aug 16, 2021

In order to allow sending and receiving compressed data without
decompressing it, we need an interface to write pre-compressed data
directly to the filesystem and the matching interface to read compressed
data without decompressing it. This adds the definitions for ioctls to
do that and detailed explanations of how to use them.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

dcb77a9a

btrfs: add code to support the block group root · 9c54e80d

Committed by Josef Bacik on Dec 15, 2021

This code adds the on disk structures for the block group root, which
will hold the block group items for extent tree v2.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>

9c54e80d

btrfs: add definition for EXTENT_TREE_V2 · 2c7d2a23

Committed by Josef Bacik on Dec 15, 2021

This adds the initial definition of the EXTENT_TREE_V2 incompat feature
flag.  This also hides the support behind CONFIG_BTRFS_DEBUG.

THIS IS AN IN-DEVELOPMENT FORMAT CHANGE, DO NOT USE UNLESS YOU ARE A
DEVELOPER OR A TESTER.

The format is in flux and will be added in stages, any fs will need to
be re-made between updates to the format.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>

2c7d2a23

NFS: Remove remaining dfprintks related to fscache and remove NFSDBG_FSCACHE · b5fdf66f

Committed by Dave Wysochanski on Mar 01, 2022

The fscache cookie APIs including fscache_acquire_cookie() and
fscache_relinquish_cookie() now have very good tracing.  Thus,
there is no real need for dfprintks in the NFS fscache interface.

The NFS fscache interface has removed all dfprintks so remove the
NFSDBG_FSCACHE defines.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

b5fdf66f

rpmsg: ctrl: Introduce new RPMSG_CREATE/RELEASE_DEV_IOCTL controls · 8109517b

Committed by Arnaud Pouliquen on Jan 24, 2022

Allow the user space application to create and release an rpmsg device
by adding RPMSG_CREATE_DEV_IOCTL and RPMSG_RELEASE_DEV_IOCTL ioctls to
the /dev/rpmsg_ctrl interface.

The RPMSG_CREATE_DEV_IOCTL ioctl can be used to instantiate a local rpmsg
device.
Depending on the back-end implementation, the associated rpmsg driver is
probed and a NS announcement can be sent to the remote processor.

The RPMSG_RELEASE_DEV_IOCTL allows the user application to release a
rpmsg device created either by the remote processor or with the
RPMSG_CREATE_DEV_IOCTL call.
Depending on the back-end implementation, the associated rpmsg driver is
removed and a NS destroy rpmsg can be sent to the remote processor.
Suggested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220124102524.295783-12-arnaud.pouliquen@foss.st.com

8109517b

12 Mar, 2022 5 commits

nvdimm/region: Delete nd_blk_region infrastructure · 3b6c6c03

Committed by Dan Williams on Mar 09, 2022

Now that the nd_namespace_blk infrastructure is removed, delete all the
region machinery to coordinate provisioning aliased capacity between
PMEM and BLK.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/164688418803.2879318.1302315202397235855.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>

3b6c6c03

net/sched: Allow flower to match on GTP options · e3acda7a

Committed by Wojciech Drewek on Mar 04, 2022

Options are as follows: PDU_TYPE:QFI, and they refer to
the fields from the PDU Session Protocol. PDU Session data
is conveyed in the GTP-U Extension Header.

GTP-U Extension Header is described in 3GPP TS 29.281.
PDU Session Protocol is described in 3GPP TS 38.415.

PDU_TYPE -  indicates the type of the PDU Session Information (4 bits)
QFI      -  QoS Flow Identifier (6 bits)

  # ip link add gtp_dev type gtp role sgsn
  # tc qdisc add dev gtp_dev ingress
  # tc filter add dev gtp_dev protocol ip parent ffff: \
      flower \
        enc_key_id 11 \
        gtp_opts 1:8/ff:ff \
      action mirred egress redirect dev eth0
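
The `gtp_opts 1:8/ff:ff` argument in the example above has a value/mask form. A minimal sketch of how such a value/mask comparison works, assuming each field is compared only on the bits set in its mask; the struct and function names are hypothetical and this is not the flower classifier code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-field option mirroring the PDU_TYPE:QFI pair. */
struct example_gtp_opt {
    uint8_t pdu_type;
    uint8_t qfi;
};

/* A packet matches when it agrees with the key on every masked bit. */
static bool example_gtp_opt_matches(struct example_gtp_opt pkt,
                                    struct example_gtp_opt key,
                                    struct example_gtp_opt mask)
{
    return (pkt.pdu_type & mask.pdu_type) == (key.pdu_type & mask.pdu_type) &&
           (pkt.qfi & mask.qfi) == (key.qfi & mask.qfi);
}
```

With key 1:8 and mask ff:ff, only packets with PDU_TYPE 1 and QFI 8 match; a mask of ff:00 would ignore the QFI entirely.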
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

e3acda7a

gtp: Implement GTP echo request · d33bd757

Committed by Wojciech Drewek on Mar 04, 2022

Adding a GTP device through ip link creates a situation where
the GTP instance is not able to send GTP echo requests.
Echo requests are used to check if the GTP peer is still alive.
With this patch, gtp_genl_ops are extended by a new cmd (GTP_CMD_ECHOREQ)
which allows sending an echo request in the given version of the GTP
protocol (v0 or v1), from the given ms address to the given
peer. TID is not included because in all path management
messages it should be equal to 0.

When a GTP echo response is detected, a multicast message is
sent to everyone in the gtp_genl_family. The message contains
the GTP version, ms address and peer address.
Suggested-by: Harald Welte <laforge@gnumonks.org>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Harald Welte <laforge@gnumonks.org>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

d33bd757

gtp: Implement GTP echo response · 9af41cc3

Committed by Wojciech Drewek on Mar 04, 2022

Adding a GTP device through ip link creates a situation where
there is no userspace daemon which would handle GTP messages
(Echo Request for example). A GTP-U instance which does not respond
to echo requests would violate the GTP specification.

When a GTP packet arrives with the GTP_ECHO_REQ message type,
GTP_ECHO_RSP is sent to the sender. The GTP_ECHO_RSP message
should contain an information element with the GTPIE_RECOVERY tag and
a restart counter value. For GTPv1 the restart counter is not used
and should be equal to 0; for GTPv0 the restart counter contains
information provided from userspace (IFLA_GTP_RESTART_COUNT).
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Suggested-by: Harald Welte <laforge@gnumonks.org>
Reviewed-by: Harald Welte <laforge@gnumonks.org>
Tested-by: Harald Welte <laforge@gnumonks.org>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

9af41cc3

gtp: Allow to create GTP device without FDs · b20dc3c6

Committed by Wojciech Drewek on Mar 04, 2022

Currently, when the user wants to create a GTP device, they have to
provide file handles to the sockets created in userspace (IFLA_GTP_FD0,
IFLA_GTP_FD1). This behaviour is not ideal, considering the option of
adding support for GTP device creation through ip link. The ip link
application is not a good place to create such sockets.

This patch allows creating a GTP device without providing the
IFLA_GTP_FD0 and IFLA_GTP_FD1 arguments. If the user sets the
IFLA_GTP_CREATE_SOCKETS attribute, then the GTP module takes care
of creating UDP sockets by itself. Sockets are created with the
commonly known UDP ports used for the GTP protocol (GTP0_PORT and
GTP1U_PORT). In this case we don't have to provide encap_destroy
because no extra deinitialization is needed; everything is covered
by udp_tunnel_sock_release.

Note: GTP instance created with only this change applied, does
not handle GTP Echo Requests. This is implemented in the following
patch.
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

b20dc3c6

openeuler / Kernel synced successfully about 2 years ago