提交 · fa86ebfcee99d83bf5b8080556e006473186a6f2 · openeuler / Kernel

28 2月, 2023 32 次提交

coredump: Limit what can interrupt coredumps · fa86ebfc

由 Eric W. Biederman 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.162
commit 4b4d2c79921a8327e9e853dc93c06b27edb3a859
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=4b4d2c79921a8327e9e853dc93c06b27edb3a859

--------------------------------

[ Upstream commit 06af8679 ]

Olivier Langlois has been struggling with coredumps being incompletely written in
processes using io_uring.

Olivier Langlois <olivier@trillion01.com> writes:
> io_uring is a big user of task_work and any event that io_uring made a
> task waiting for that occurs during the core dump generation will
> generate a TIF_NOTIFY_SIGNAL.
>
> Here are the detailed steps of the problem:
> 1. io_uring calls vfs_poll() to install a task to a file wait queue
>    with io_async_wake() as the wakeup function cb from io_arm_poll_handler()
> 2. wakeup function ends up calling task_work_add() with TWA_SIGNAL
> 3. task_work_add() sets the TIF_NOTIFY_SIGNAL bit by calling
>    set_notify_signal()

The coredump code deliberately supports being interrupted by SIGKILL,
and depends upon prepare_signal to filter out all other signals.   Now
that signal_pending includes wake ups for TIF_NOTIFY_SIGNAL this hack
in dump_emitted by the coredump code no longer works.

Make the coredump code more robust by explicitly testing for all of
the wakeup conditions the coredump code supports.  This prevents
new wakeup conditions from breaking the coredump code, as well
as fixing the current issue.

The filesystem code that the coredump code uses already limits
itself to only aborting on fatal_signal_pending.  So it should
not develop surprising wake-up reasons either.

v2: Don't remove the now unnecessary code in prepare_signal.

Cc: stable@vger.kernel.org
Fixes: 12db8b69 ("entry: Add support for TIF_NOTIFY_SIGNAL")
Reported-by: NOlivier Langlois <olivier@trillion01.com>
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Acked-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

fa86ebfc

kernel: provide create_io_thread() helper · af6e1dbc

由 Jens Axboe 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.162
commit 1500fed00878fd59b2d6a1832b8d3f7c261a5671
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=1500fed00878fd59b2d6a1832b8d3f7c261a5671

--------------------------------

[ Upstream commit cc440e87 ]

Provide a generic helper for setting up an io_uring worker. Returns a
task_struct so that the caller can do whatever setup is needed, then call
wake_up_new_task() to kick it into gear.

Add a kernel_clone_args member, io_thread, which tells copy_process() to
mark the task with PF_IO_WORKER.
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

af6e1dbc

fs: provide locked helper variant of close_fd_get_file() · de68e074

由 Jens Axboe 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.162
commit d2136fc145be417e851dfe50703fac2af6aabe46
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=d2136fc145be417e851dfe50703fac2af6aabe46

--------------------------------

[ Upstream commit 53dec2ea ]

Assumes current->files->file_lock is already held on invocation. Helps
the caller check the file before removing the fd, if it needs to.
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

de68e074

kernel: remove checking for TIF_NOTIFY_SIGNAL · 077bc296

由 Jens Axboe 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.162
commit 90a2c3821bbfe8435bde901953871576a1bf8c6d
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=90a2c3821bbfe8435bde901953871576a1bf8c6d

--------------------------------

[ Upstream commit e296dc49 ]

It's available everywhere now, no need to check or add dummy defines.
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

077bc296

entry: Add support for TIF_NOTIFY_SIGNAL · ad404010

由 Jens Axboe 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.162
commit 3c295bd2ddaecf3509458c86bf7ba610042f3609
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=3c295bd2ddaecf3509458c86bf7ba610042f3609

--------------------------------

[ Upstream commit 12db8b69 ]

Add TIF_NOTIFY_SIGNAL handling in the generic entry code, which if set,
will return true if signal_pending() is used in a wait loop. That causes an
exit of the loop so that notify_signal tracehooks can be run. If the wait
loop is currently inside a system call, the system call is restarted once
task_work has been processed.

In preparation for only having arch_do_signal() handle syscall restarts if
_TIF_SIGPENDING isn't set, rename it to arch_do_signal_or_restart().  Pass
in a boolean that tells the architecture specific signal handler if it
should attempt to get a signal, or just process a potential syscall
restart.

For !CONFIG_GENERIC_ENTRY archs, add the TIF_NOTIFY_SIGNAL handling to
get_signal(). This is done to minimize the needed architecture changes to
support this feature.
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NOleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20201026203230.386348-3-axboe@kernel.dkSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

Conflict:
	include/linux/tracehook.h
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Ntanghui <tanghui20@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

ad404010

signal: Add task_sigpending() helper · 25f86b0d

由 Jens Axboe 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.162
commit 52cfde6bbf64f7f058efa805c6dbb6332d4de6fa
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=52cfde6bbf64f7f058efa805c6dbb6332d4de6fa

--------------------------------

[ Upstream commit 5c251e9d ]

This is in preparation for maintaining signal_pending() as the decider of
whether or not a schedule() loop should be broken, or continue sleeping.
This is different than the core signal use cases, which really need to know
whether an actual signal is pending or not. task_sigpending() returns
non-zero if TIF_SIGPENDING is set.

Only core kernel use cases should care about the distinction between
the two, make sure those use the task_sigpending() helper.
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NOleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20201026203230.386348-2-axboe@kernel.dkSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

25f86b0d

arm: add support for TIF_NOTIFY_SIGNAL · d6f84bb2

由 Jens Axboe 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.162
commit 1bee9dbbcabbb77617fb257f964628b50ba2529c
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=1bee9dbbcabbb77617fb257f964628b50ba2529c

--------------------------------

[ Upstream commit 32d59773 ]

Wire up TIF_NOTIFY_SIGNAL handling for arm.

Cc: linux-arm-kernel@lists.infradead.org
Acked-by: NRussell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

Conflicts:
	arch/arm/include/asm/thread_info.h
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Ntanghui <tanghui20@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

d6f84bb2

arm64: add support for TIF_NOTIFY_SIGNAL · 63e80185

由 Jens Axboe 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.162
commit 79a9991e87fe61ff2a8b5ac8c112b3ce3544cb53
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=79a9991e87fe61ff2a8b5ac8c112b3ce3544cb53

--------------------------------

Wire up TIF_NOTIFY_SIGNAL handling for arm64.

Cc: linux-arm-kernel@lists.infradead.org
Acked-by: NWill Deacon <will@kernel.org>
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

Conflicts:
	arch/arm64/include/asm/thread_info.h
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Ntanghui <tanghui20@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

63e80185

riscv: add support for TIF_NOTIFY_SIGNAL · d6b6a363

由 Jens Axboe 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.162
commit 78a53ff02656da3c1e6a3e11e30ab23cf4bcb0f7
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=78a53ff02656da3c1e6a3e11e30ab23cf4bcb0f7

--------------------------------

[ Upstream commit 24a31b81 ]

Wire up TIF_NOTIFY_SIGNAL handling for riscv.

Cc: linux-riscv@lists.infradead.org
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

d6b6a363

powerpc: add support for TIF_NOTIFY_SIGNAL · b9418d58

由 Jens Axboe 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.162
commit abab3d4444b5f4732d741f45feb954fabb78af7f
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=abab3d4444b5f4732d741f45feb954fabb78af7f

--------------------------------

[ Upstream commit 900f0713 ]

Wire up TIF_NOTIFY_SIGNAL handling for powerpc.

Cc: linuxppc-dev@lists.ozlabs.org
Acked-by: NMichael Ellerman <mpe@ellerman.id.au>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

b9418d58

x86: Wire up TIF_NOTIFY_SIGNAL · 5c3c010f

由 Jens Axboe 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.162
commit 4b1dcf8ec9b2f11b57f1ff5dcaa1f8575c7dacb5
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=4b1dcf8ec9b2f11b57f1ff5dcaa1f8575c7dacb5

--------------------------------

[ Upstream commit c8d5ed67 ]

The generic entry code has support for TIF_NOTIFY_SIGNAL already. Just
provide the TIF bit.

[ tglx: Adopted to other TIF changes in x86 ]
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201026203230.386348-4-axboe@kernel.dkSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

5c3c010f

iov_iter: add helper to save iov_iter state · 63da7f3d

由 Jens Axboe 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.162
commit e86db87191d8f91dfe787758c6dd646c2bf0c335
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=e86db87191d8f91dfe787758c6dd646c2bf0c335

--------------------------------

[ Upstream commit 8fb0f47a ]

In an ideal world, when someone is passed an iov_iter and returns X bytes,
then X bytes would have been consumed/advanced from the iov_iter. But we
have use cases that always consume the entire iterator, a few examples
of that are iomap and bdev O_DIRECT. This means we cannot rely on the
state of the iov_iter once we've called ->read_iter() or ->write_iter().

This would be easier if we didn't always have to deal with truncate of
the iov_iter, as rewinding would be trivial without that. We recently
added a commit to track the truncate state, but that grew the iov_iter
by 8 bytes and wasn't the best solution.

Implement a helper to save enough of the iov_iter state to sanely restore
it after we've called the read/write iterator helpers. This currently
only works for IOVEC/BVEC/KVEC as that's all we need, support for other
iterator types are left as an exercise for the reader.

Link: https://lore.kernel.org/linux-fsdevel/CAHk-=wiacKV4Gh-MYjteU0LwNBSGpWrK-Ov25HdqB1ewinrFPg@mail.gmail.com/Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

63da7f3d

perf beauty: Update copy of linux/socket.h with the kernel sources · fb233a88

由 Arnaldo Carvalho de Melo 提交于 2月 28, 2023

mainline inclusion
from mainline-v5.15-rc1
commit bb91de44
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.2-rc7&id=bb91de44693b1c10fe2f9e668506b01e88efed0e

--------------------------------

To pick the changes in:

Fixes: d32f89da ("net: add accept helper not installing fd")
Fixes: bc49d816 ("mctp: Add MCTP base")

This automagically adds support for the AF_MCTP protocol domain:

  $ tools/perf/trace/beauty/socket.sh > before
  $ cp include/linux/socket.h tools/perf/trace/beauty/include/linux/socket.h
  $ tools/perf/trace/beauty/socket.sh > after
  $ diff -u before after
  --- before	2021-09-06 11:57:14.972747200 -0300
  +++ after	2021-09-06 11:57:30.541920222 -0300
  @@ -44,4 +44,5 @@
   	[42] = "QIPCRTR",
   	[43] = "SMC",
   	[44] = "XDP",
  +	[45] = "MCTP",
   };
  $

This will allow 'perf trace' to translate 45 into "MCTP" as is done with
the other domains:

  # perf trace -e socket*
     0.000 chronyd/1029 socket(family: INET, type: DGRAM|CLOEXEC|NONBLOCK, protocol: IP) = 4
  ^C#

This addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/perf/trace/beauty/include/linux/socket.h' differs from latest version at 'include/linux/socket.h'
  diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h

Cc: David S. Miller <davem@davemloft.net>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jeremy Kerr <jk@codeconstruct.com.au>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

Conflict:
	tools/perf/trace/beauty/include/linux/socket.h
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

fb233a88

perf trace beauty: Update copy of linux/socket.h with the kernel sources · fa5386c6

由 Arnaldo Carvalho de Melo 提交于 2月 28, 2023

mainline inclusion
from mainline-v5.11-rc1
commit eb2842da
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.2-rc7&id=eb2842da77e1f7a3c46033f930524ab76dffe67a

--------------------------------

This just triggers the rebuilding of the syscall beautifiers that
extract patterns from this file due to this cset:

  b713c195 ("net: provide __sys_shutdown_sock() that takes a socket")

After updating it:

    CC       /tmp/build/perf/trace/beauty/sockaddr.o

Addressing this perf build warning:

  Warning: Kernel ABI header at 'tools/perf/trace/beauty/include/linux/socket.h' differs from latest version at 'include/linux/socket.h'
  diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

Conflict：
	tools/perf/trace/beauty/include/linux/socket.h
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

fa5386c6

io_uring: correct pinned_vm accounting · 2a539f2e

由 Pavel Begunkov 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.150
commit 67cbc8865a66533fa08c1c13fe9acbaaae63c403
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=67cbc8865a66533fa08c1c13fe9acbaaae63c403

--------------------------------

[ upstream commit 42b6419d ]

->mm_account should be released only after we free all registered
buffers, otherwise __io_sqe_buffers_unregister() will see a NULL
->mm_account and skip locked_vm accounting.

Cc: <Stable@vger.kernel.org>
Signed-off-by: NPavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/6d798f65ed4ab8db3664c4d3397d4af16ca98846.1664849932.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

2a539f2e

file: Rename __close_fd_get_file close_fd_get_file · 2d57ade4

由 Eric W. Biederman 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.162
commit 57b20530363d127ab6a82e336275769258eb5f37
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=57b20530363d127ab6a82e336275769258eb5f37

--------------------------------

[ Upstream commit 9fe83c43 ]

The function close_fd_get_file is explicitly a variant of
__close_fd[1].  Now that __close_fd has been renamed close_fd, rename
close_fd_get_file to be consistent with close_fd.

When __alloc_fd, __close_fd and __fd_install were introduced the
double underscore indicated that the function took a struct
files_struct parameter.  The function __close_fd_get_file never has so
the naming has always been inconsistent.  This just cleans things up
so there are not any lingering mentions or references __close_fd left
in the code.

[1] 80cd7956 ("binder: fix use-after-free due to ksys_close() during fdget()")
Link: https://lkml.kernel.org/r/20201120231441.29911-23-ebiederm@xmission.comSigned-off-by: NEric W. Biederman <ebiederm@xmission.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

2d57ade4

io_uring: don't hold uring_lock when calling io_run_task_work* · 8de731f2

由 Hao Xu 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.158
commit a2efc465245e535fefcad8c4ed5967254344257d
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.168&id=a2efc465245e535fefcad8c4ed5967254344257d

--------------------------------

commit 8bad28d8 upstream.

Abaci reported the below issue:
[  141.400455] hrtimer: interrupt took 205853 ns
[  189.869316] process 'usr/local/ilogtail/ilogtail_0.16.26' started with executable stack
[  250.188042]
[  250.188327] ============================================
[  250.189015] WARNING: possible recursive locking detected
[  250.189732] 5.11.0-rc4 #1 Not tainted
[  250.190267] --------------------------------------------
[  250.190917] a.out/7363 is trying to acquire lock:
[  250.191506] ffff888114dbcbe8 (&ctx->uring_lock){+.+.}-{3:3}, at: __io_req_task_submit+0x29/0xa0
[  250.192599]
[  250.192599] but task is already holding lock:
[  250.193309] ffff888114dbfbe8 (&ctx->uring_lock){+.+.}-{3:3}, at: __x64_sys_io_uring_register+0xad/0x210
[  250.194426]
[  250.194426] other info that might help us debug this:
[  250.195238]  Possible unsafe locking scenario:
[  250.195238]
[  250.196019]        CPU0
[  250.196411]        ----
[  250.196803]   lock(&ctx->uring_lock);
[  250.197420]   lock(&ctx->uring_lock);
[  250.197966]
[  250.197966]  *** DEADLOCK ***
[  250.197966]
[  250.198837]  May be due to missing lock nesting notation
[  250.198837]
[  250.199780] 1 lock held by a.out/7363:
[  250.200373]  #0: ffff888114dbfbe8 (&ctx->uring_lock){+.+.}-{3:3}, at: __x64_sys_io_uring_register+0xad/0x210
[  250.201645]
[  250.201645] stack backtrace:
[  250.202298] CPU: 0 PID: 7363 Comm: a.out Not tainted 5.11.0-rc4 #1
[  250.203144] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[  250.203887] Call Trace:
[  250.204302]  dump_stack+0xac/0xe3
[  250.204804]  __lock_acquire+0xab6/0x13a0
[  250.205392]  lock_acquire+0x2c3/0x390
[  250.205928]  ? __io_req_task_submit+0x29/0xa0
[  250.206541]  __mutex_lock+0xae/0x9f0
[  250.207071]  ? __io_req_task_submit+0x29/0xa0
[  250.207745]  ? 0xffffffffa0006083
[  250.208248]  ? __io_req_task_submit+0x29/0xa0
[  250.208845]  ? __io_req_task_submit+0x29/0xa0
[  250.209452]  ? __io_req_task_submit+0x5/0xa0
[  250.210083]  __io_req_task_submit+0x29/0xa0
[  250.210687]  io_async_task_func+0x23d/0x4c0
[  250.211278]  task_work_run+0x89/0xd0
[  250.211884]  io_run_task_work_sig+0x50/0xc0
[  250.212464]  io_sqe_files_unregister+0xb2/0x1f0
[  250.213109]  __io_uring_register+0x115a/0x1750
[  250.213718]  ? __x64_sys_io_uring_register+0xad/0x210
[  250.214395]  ? __fget_files+0x15a/0x260
[  250.214956]  __x64_sys_io_uring_register+0xbe/0x210
[  250.215620]  ? trace_hardirqs_on+0x46/0x110
[  250.216205]  do_syscall_64+0x2d/0x40
[  250.216731]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  250.217455] RIP: 0033:0x7f0fa17e5239
[  250.218034] Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05  3d 01 f0 ff ff 73 01 c3 48 8b 0d 27 ec 2c 00 f7 d8 64 89 01 48
[  250.220343] RSP: 002b:00007f0fa1eeac48 EFLAGS: 00000246 ORIG_RAX: 00000000000001ab
[  250.221360] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f0fa17e5239
[  250.222272] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 0000000000000008
[  250.223185] RBP: 00007f0fa1eeae20 R08: 0000000000000000 R09: 0000000000000000
[  250.224091] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  250.224999] R13: 0000000000021000 R14: 0000000000000000 R15: 00007f0fa1eeb700

This is caused by calling io_run_task_work_sig() to do work under
uring_lock while the caller io_sqe_files_unregister() already held
uring_lock.
To fix this issue, briefly drop uring_lock when calling
io_run_task_work_sig(), and there are two things to concern:

- hold uring_lock in io_ring_ctx_free() around io_sqe_files_unregister()
    this is for consistency of lock/unlock.
- add new fixed rsrc ref node before dropping uring_lock
    it's not safe to do io_uring_enter-->percpu_ref_get() with a dying one.
- check if rsrc_data->refs is dying to avoid parallel io_sqe_files_unregister
Reported-by: NAbaci <abaci@linux.alibaba.com>
Fixes: 1ffc5422 ("io_uring: fix io_sqe_files_unregister() hangs")
Suggested-by: NPavel Begunkov <asml.silence@gmail.com>
Signed-off-by: NHao Xu <haoxu@linux.alibaba.com>
[axboe: fixes from Pavel folded in]
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NSamiullah Khawaja <skhawaja@google.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

8de731f2

io_uring: don't take uring_lock during iowq cancel · b6562159

由 Pavel Begunkov 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.77
commit 3f2c12ec8a3f992c528c7ad83f7272122dfe8d84
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.167&id=3f2c12ec8a3f992c528c7ad83f7272122dfe8d84

--------------------------------

commit 792bb6eb upstream.

[   97.866748] a.out/2890 is trying to acquire lock:
[   97.867829] ffff8881046763e8 (&ctx->uring_lock){+.+.}-{3:3}, at:
io_wq_submit_work+0x155/0x240
[   97.869735]
[   97.869735] but task is already holding lock:
[   97.871033] ffff88810dfe0be8 (&ctx->uring_lock){+.+.}-{3:3}, at:
__x64_sys_io_uring_enter+0x3f0/0x5b0
[   97.873074]
[   97.873074] other info that might help us debug this:
[   97.874520]  Possible unsafe locking scenario:
[   97.874520]
[   97.875845]        CPU0
[   97.876440]        ----
[   97.877048]   lock(&ctx->uring_lock);
[   97.877961]   lock(&ctx->uring_lock);
[   97.878881]
[   97.878881]  *** DEADLOCK ***
[   97.878881]
[   97.880341]  May be due to missing lock nesting notation
[   97.880341]
[   97.881952] 1 lock held by a.out/2890:
[   97.882873]  #0: ffff88810dfe0be8 (&ctx->uring_lock){+.+.}-{3:3}, at:
__x64_sys_io_uring_enter+0x3f0/0x5b0
[   97.885108]
[   97.885108] stack backtrace:
[   97.890457] Call Trace:
[   97.891121]  dump_stack+0xac/0xe3
[   97.891972]  __lock_acquire+0xab6/0x13a0
[   97.892940]  lock_acquire+0x2c3/0x390
[   97.894894]  __mutex_lock+0xae/0x9f0
[   97.901101]  io_wq_submit_work+0x155/0x240
[   97.902112]  io_wq_cancel_cb+0x162/0x490
[   97.904126]  io_async_find_and_cancel+0x3b/0x140
[   97.905247]  io_issue_sqe+0x86d/0x13e0
[   97.909122]  __io_queue_sqe+0x10b/0x550
[   97.913971]  io_queue_sqe+0x235/0x470
[   97.914894]  io_submit_sqes+0xcce/0xf10
[   97.917872]  __x64_sys_io_uring_enter+0x3fb/0x5b0
[   97.921424]  do_syscall_64+0x2d/0x40
[   97.922329]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

While holding uring_lock, e.g. from inline execution, async cancel
request may attempt cancellations through io_wq_submit_work, which may
try to grab a lock. Delay it to task_work, so we do it from a clean
context and don't have to worry about locking.

Cc: <stable@vger.kernel.org> # 5.5+
Fixes: c07e6719 ("io_uring: hold uring_lock while completing failed polled io in io_wq_submit_work()")
Reported-by: NAbaci <abaci@linux.alibaba.com>
Reported-by: NHao Xu <haoxu@linux.alibaba.com>
Signed-off-by: NPavel Begunkov <asml.silence@gmail.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
[Lee: The first hunk solves a different (double free) issue in v5.10.
      Only the first hunk of the original patch is relevant to v5.10 AND
      the first hunk of the original patch is only relevant to v5.10]
Reported-by: syzbot+59d8a1f4e60c20c066cf@syzkaller.appspotmail.com
Signed-off-by: NLee Jones <lee.jones@linaro.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

b6562159

fs: make do_renameat2() take struct filename · cc698f72

由 Jens Axboe 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.162
commit 214f80e25176eb4d756bc9fe528ef7bf23d2f9a1
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.167&id=214f80e25176eb4d756bc9fe528ef7bf23d2f9a1

--------------------------------

[ Upstream commit e886663c ]

Pass in the struct filename pointers instead of the user string, and
update the three callers to do the same.

This behaves like do_unlinkat(), which also takes a filename struct and
puts it when it is done. Converting callers is then trivial.
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

cc698f72

net: add accept helper not installing fd · aed99eb7

由 Pavel Begunkov 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.162
commit ad0b0137953a2c973958dadf6d222e120e278856
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.167&id=ad0b0137953a2c973958dadf6d222e120e278856

--------------------------------

[ Upstream commit d32f89da ]

Introduce and reuse a helper that acts similarly to __sys_accept4_file()
but returns struct file instead of installing file descriptor. Will be
used by io_uring.
Signed-off-by: NPavel Begunkov <asml.silence@gmail.com>
Acked-by: NJakub Kicinski <kuba@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Acked-by: NDavid S. Miller <davem@davemloft.net>
Link: https://lore.kernel.org/r/c57b9e8e818d93683a3d24f8ca50ca038d1da8c4.1629888991.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

aed99eb7

net: provide __sys_shutdown_sock() that takes a socket · bbd64b56

由 Jens Axboe 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.162
commit 069ac28d92432dd7cdac0a2c141a1b3b8d4330d5
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.167&id=069ac28d92432dd7cdac0a2c141a1b3b8d4330d5

--------------------------------

[ Upstream commit b713c195 ]

No functional changes in this patch, needed to provide io_uring support
for shutdown(2).

Cc: netdev@vger.kernel.org
Cc: David S. Miller <davem@davemloft.net>
Acked-by: NJakub Kicinski <kuba@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

bbd64b56

fs: expose LOOKUP_CACHED through openat2() RESOLVE_CACHED · beaa0ca3

由 Jens Axboe 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.162
commit 5683caa7350f389d099b72bfdb289d2073286e32
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.167&id=5683caa7350f389d099b72bfdb289d2073286e32

--------------------------------

[ Upstream commit 99668f61 ]

Now that we support non-blocking path resolution internally, expose it
via openat2() in the struct open_how ->resolve flags. This allows
applications using openat2() to limit path resolution to the extent that
it is already cached.

If the lookup cannot be satisfied in a non-blocking manner, openat2(2)
will return -1/-EAGAIN.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

beaa0ca3

Make sure nd->path.mnt and nd->path.dentry are always valid pointers · df33726f

由 Al Viro 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.162
commit 0cf0ce8fb5b10d669072345ea855de112d0e0a43
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.167&id=0cf0ce8fb5b10d669072345ea855de112d0e0a43

--------------------------------

[ Upstream commit 7d01ef75 ]

Initialize them in set_nameidata() and make sure that terminate_walk() clears them
once the pointers become potentially invalid (i.e. we leave RCU mode or drop them
in non-RCU one).  Currently we have "path_init() always initializes them and nobody
accesses them outside of path_init()/terminate_walk() segments", which is asking
for trouble.

With that change we would have nd->path.{mnt,dentry}
	1) always valid - NULL or pointing to currently allocated objects.
	2) non-NULL while we are successfully walking
	3) NULL when we are not walking at all
	4) contributing to refcounts whenever non-NULL outside of RCU mode.

Fixes: 6c6ec2b0 ("fs: add support for LOOKUP_CACHED")
Reported-by: syzbot+c88a7030da47945a3cc3@syzkaller.appspotmail.com
Tested-by: NChristian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

Conflict：
	fs/namei.c
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

df33726f

fix handling of nd->depth on LOOKUP_CACHED failures in try_to_unlazy* · 8c4b5fe6

由 Al Viro 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.162
commit 146fe79fff13fea7b5f3a9e913689e07fd4e6432
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.167&id=146fe79fff13fea7b5f3a9e913689e07fd4e6432

--------------------------------

[ Upstream commit eacd9aa8 ]

After switching to non-RCU mode, we want nd->depth to match the number
of entries in nd->stack[] that need eventual path_put().
legitimize_links() takes care of that on failures; unfortunately,
failure exits added for LOOKUP_CACHED do not.

We could add the logics for that into those failure exits, both in
try_to_unlazy() and in try_to_unlazy_next(), but since both checks
are immediately followed by legitimize_links() and there's no calls
of legitimize_links() other than those two...  It's easier to
move the check (and required handling of nd->depth on failure) into
legitimize_links() itself.

[caught by Jens: ... and since we are zeroing ->depth here, we need
to do drop_links() first]

Fixes: 6c6ec2b0 "fs: add support for LOOKUP_CACHED"
Tested-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

8c4b5fe6

fs: add support for LOOKUP_CACHED · e489dea3

由 Jens Axboe 提交于 2月 28, 2023

stable inclusion
from stable-v5.10.162
commit c1fe7bd3e1aa85865396b464b31f28b094a4353c
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.10.167&id=c1fe7bd3e1aa85865396b464b31f28b094a4353c

--------------------------------

[ Upstream commit 6c6ec2b0 ]

io_uring always punts opens to async context, since there's no control
over whether the lookup blocks or not. Add LOOKUP_CACHED to support
just doing the fast RCU based lookups, which we know will not block. If
we can do a cached path resolution of the filename, then we don't have
to always punt lookups for a worker.

During path resolution, we always do LOOKUP_RCU first. If that fails and
we terminate LOOKUP_RCU, then fail a LOOKUP_CACHED attempt as well.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

Conflict:
  fs/namei.c
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

e489dea3

Revert "io_uring: fix soft lockup when call __io_remove_buffers" · cd0d1c7e

由 Li Lingfeng 提交于 2月 28, 2023

Offering: HULK
hulk inclusion
category: feature
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC

-------------------------------

This reverts commit 4222bec0.

We need to apply patch 788d0824269bef (io_uring: import 5.15-stable
io_uring) to move io_uring to separate directory and solve
the problem of CVE-2023-0240.
This patch can be reverted since patch 788d0824269bef contains it.
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

cd0d1c7e

Revert "io_uring: deduplicate failing task_work_add" · de4d9efe

由 Li Lingfeng 提交于 2月 28, 2023

Offering: HULK
hulk inclusion
category: feature
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC

-------------------------------

This reverts commit 62ca1710.

This patch extracts a function for patch 792bb6eb (io_uring: don't
take uring_lock during iowq cancel). We can revert it since patch
792bb6eb has been replaced by the one from stable/5.10.
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

de4d9efe

Revert "io_uring: don't take uring_lock during iowq cancel" · 0839d957

由 Li Lingfeng 提交于 2月 28, 2023

Offering: HULK
hulk inclusion
category: feature
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC

-------------------------------

This reverts commit c5562a26.

We need to apply patch 788d0824269bef (io_uring: import 5.15-stable
io_uring) to move io_uring to separate directory and solve
the problem of CVE-2023-0240.
This patch can be replaced by the same one from stable/5.10 to
eliminate conflicts.
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

0839d957

Revert "[Backport] io_uring: don't keep looping for more events if we can't flush overflow" · 8d894ade

由 Li Lingfeng 提交于 2月 28, 2023

Offering: HULK
hulk inclusion
category: feature
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC

-------------------------------

This reverts commit da4cb346.

We need to apply patch 788d0824269bef (io_uring: import 5.15-stable
io_uring) to move io_uring to separate directory and solve
the problem of CVE-2023-0240.
This patch can be reverted since patch 788d0824269bef contains it.
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

8d894ade

Revert "[Huawei] io-wq: Switch io_wqe_worker's fs before releasing request" · 2334b1a9

由 Li Lingfeng 提交于 2月 28, 2023

Offering: HULK
hulk inclusion
category: feature
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC

-------------------------------

This reverts commit 00bb60b9.

We need to apply patch 788d0824269bef (io_uring: import 5.15-stable
io_uring) to move io_uring to separate directory and solve
the problem of CVE-2023-0240.
Revert this patch and add it again after patch 788d0824269bef to
reduce conflicts.
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

2334b1a9

Revert "[Huawei] io_uring: fix soft lockup in io_submit_sqes()" · cb0861ae

由 Li Lingfeng 提交于 2月 28, 2023

Offering: HULK
hulk inclusion
category: feature
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC

-------------------------------

This reverts commit ac02c0bf.

We need to apply patch 788d0824269bef (io_uring: import 5.15-stable
io_uring) to move io_uring to separate directory and solve
the problem of CVE-2023-0240.
Revert this patch and add it again after patch 788d0824269bef to
reduce conflicts.
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

cb0861ae

Revert "[Huawei] io_uring:drop identity before creating a private one" · eb278e9b

由 Li Lingfeng 提交于 2月 28, 2023

Offering: HULK
hulk inclusion
category: feature
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC

-------------------------------

This reverts commit 6d9aaec1.

We need to apply patch 788d0824269bef (io_uring: import 5.15-stable
io_uring) to move io_uring to separate directory and solve
the problem of CVE-2023-0240.
This patch fix a uaf problem of io_identity, and it can be reverted
since io_identity is removed in patch 788d0824269bef.
Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

eb278e9b

27 2月, 2023 1 次提交

!332 [5.10]Make Multiple functions On Netswift PCIE NIC belong to different IOMMU group · 443c6146

由 openeuler-ci-bot 提交于 2月 27, 2023

Merge Pull Request from: @duanqiangwen 
 
This PR is to workaround if multiple functions on a Beijing Wangxun PCl device belong to the same lOMMU group,
    they can be directly assigned to only one VM as well, letting multiple functions
    belong to different IOMMU group.

Issue：https://gitee.com/openeuler/kernel/issues/I66W4Y

Hardware List:
Netswift All Nic, PCI vendor ID 0x8088

Net-Swift Official Website:
https://www.net-swift.com 
 
Link:https://gitee.com/openeuler/kernel/pulls/332 

Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> 
Reviewed-by: Zheng Zengkai <zhengzengkai@huawei.com> 
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

443c6146

25 2月, 2023 7 次提交

!414 Backport CVEs and bugfixes · 6d789c0c

由 openeuler-ci-bot 提交于 2月 25, 2023

Merge Pull Request from: @zhangjialin11 
 
Pull new CVEs:
CVE-2023-0597
CVE-2023-0615

Huawei BMA bugfix from Huajingjing
mm bugfixes from Lu Jialin and Zhang Peng
vfio bugfixes from Kunkun Jiang
net bugfixes from Baisong Zhong, Liu Jian, Ziyang Xuan and Zhengchao Shao
arm32 kaslr bugfix from Cui GaoSheng
fs bugfix from ZhaoLong Wang 
 
Link:https://gitee.com/openeuler/kernel/pulls/414 

Reviewed-by: Zheng Zengkai <zhengzengkai@huawei.com> 
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

6d789c0c

x86/kasan: Populate shadow for shared chunk of the CPU entry area · c689e357

由 Sean Christopherson 提交于 2月 22, 2023

mainline inclusion
from mainline-v6.2-rc1
commit 1cfaac24
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6C6UC
CVE: CVE-2023-0597

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1cfaac2400c73378e78182a706be0f3ac8b93cd7

--------------------------------

Popuplate the shadow for the shared portion of the CPU entry area, i.e.
the read-only IDT mapping, during KASAN initialization.  A recent change
modified KASAN to map the per-CPU areas on-demand, but forgot to keep a
shadow for the common area that is shared amongst all CPUs.

Map the common area in KASAN init instead of letting idt_map_in_cea() do
the dirty work so that it Just Works in the unlikely event more shared
data is shoved into the CPU entry area.

The bug manifests as a not-present #PF when software attempts to lookup
an IDT entry, e.g. when KVM is handling IRQs on Intel CPUs (KVM performs
direct CALL to the IRQ handler to avoid the overhead of INTn):

 BUG: unable to handle page fault for address: fffffbc0000001d8
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 16c03a067 P4D 16c03a067 PUD 0
 Oops: 0000 [#1] PREEMPT SMP KASAN
 CPU: 5 PID: 901 Comm: repro Tainted: G        W          6.1.0-rc3+ #410
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
 RIP: 0010:kasan_check_range+0xdf/0x190
  vmx_handle_exit_irqoff+0x152/0x290 [kvm_intel]
  vcpu_run+0x1d89/0x2bd0 [kvm]
  kvm_arch_vcpu_ioctl_run+0x3ce/0xa70 [kvm]
  kvm_vcpu_ioctl+0x349/0x900 [kvm]
  __x64_sys_ioctl+0xb8/0xf0
  do_syscall_64+0x2b/0x50
  entry_SYSCALL_64_after_hwframe+0x46/0xb0

Fixes: 9fd429c28073 ("x86/kasan: Map shadow for percpu pages on demand")
Reported-by: syzbot+8cdd16fd5a6c0565e227@syzkaller.appspotmail.com
Signed-off-by: NSean Christopherson <seanjc@google.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221110203504.1985010-6-seanjc@google.comSigned-off-by: NTong Tiangen <tongtiangen@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

c689e357

x86/kasan: Add helpers to align shadow addresses up and down · 7ba149ff

由 Sean Christopherson 提交于 2月 22, 2023

mainline inclusion
from mainline-v6.2-rc1
commit bde258d9
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6C6UC
CVE: CVE-2023-0597

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bde258d97409f2a45243cb393a55ea9ecfc7aba5

--------------------------------

Add helpers to dedup code for aligning shadow address up/down to page
boundaries when translating an address to its shadow.

No functional change intended.
Signed-off-by: NSean Christopherson <seanjc@google.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NAndrey Ryabinin <ryabinin.a.a@gmail.com>
Link: https://lkml.kernel.org/r/20221110203504.1985010-5-seanjc@google.comSigned-off-by: NTong Tiangen <tongtiangen@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

7ba149ff

x86/kasan: Rename local CPU_ENTRY_AREA variables to shorten names · d8e31b4b

由 Sean Christopherson 提交于 2月 22, 2023

mainline inclusion
from mainline-v6.2-rc1
commit 7077d2cc
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6C6UC
CVE: CVE-2023-0597

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7077d2ccb94dafd00b29cc2d601c9f6891648f5b

--------------------------------

Rename the CPU entry area variables in kasan_init() to shorten their
names, a future fix will reference the beginning of the per-CPU portion
of the CPU entry area, and shadow_cpu_entry_per_cpu_begin is a bit much.

No functional change intended.
Signed-off-by: NSean Christopherson <seanjc@google.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NAndrey Ryabinin <ryabinin.a.a@gmail.com>
Link: https://lkml.kernel.org/r/20221110203504.1985010-4-seanjc@google.comSigned-off-by: NTong Tiangen <tongtiangen@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

d8e31b4b

x86/mm: Populate KASAN shadow for entire per-CPU range of CPU entry area · 8706d8f9

由 Sean Christopherson 提交于 2月 22, 2023

mainline inclusion
from mainline-v6.2-rc1
commit 97650148
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6C6UC
CVE: CVE-2023-0597

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=97650148a15e0b30099d6175ffe278b9f55ec66a

--------------------------------

Populate a KASAN shadow for the entire possible per-CPU range of the CPU
entry area instead of requiring that each individual chunk map a shadow.
Mapping shadows individually is error prone, e.g. the per-CPU GDT mapping
was left behind, which can lead to not-present page faults during KASAN
validation if the kernel performs a software lookup into the GDT.  The DS
buffer is also likely affected.

The motivation for mapping the per-CPU areas on-demand was to avoid
mapping the entire 512GiB range that's reserved for the CPU entry area,
shaving a few bytes by not creating shadows for potentially unused memory
was not a goal.

The bug is most easily reproduced by doing a sigreturn with a garbage
CS in the sigcontext, e.g.

  int main(void)
  {
    struct sigcontext regs;

    syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
    syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul);
    syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);

    memset(&regs, 0, sizeof(regs));
    regs.cs = 0x1d0;
    syscall(__NR_rt_sigreturn);
    return 0;
  }

to coerce the kernel into doing a GDT lookup to compute CS.base when
reading the instruction bytes on the subsequent #GP to determine whether
or not the #GP is something the kernel should handle, e.g. to fixup UMIP
violations or to emulate CLI/STI for IOPL=3 applications.

  BUG: unable to handle page fault for address: fffffbc8379ace00
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 16c03a067 P4D 16c03a067 PUD 15b990067 PMD 15b98f067 PTE 0
  Oops: 0000 [#1] PREEMPT SMP KASAN
  CPU: 3 PID: 851 Comm: r2 Not tainted 6.1.0-rc3-next-20221103+ #432
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:kasan_check_range+0xdf/0x190
  Call Trace:
   <TASK>
   get_desc+0xb0/0x1d0
   insn_get_seg_base+0x104/0x270
   insn_fetch_from_user+0x66/0x80
   fixup_umip_exception+0xb1/0x530
   exc_general_protection+0x181/0x210
   asm_exc_general_protection+0x22/0x30
  RIP: 0003:0x0
  Code: Unable to access opcode bytes at 0xffffffffffffffd6.
  RSP: 0003:0000000000000000 EFLAGS: 00000202
  RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000000001d0
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
  RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
  R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
   </TASK>

Fixes: 9fd429c28073 ("x86/kasan: Map shadow for percpu pages on demand")
Reported-by: syzbot+ffb4f000dc2872c93f62@syzkaller.appspotmail.com
Suggested-by: NAndrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: NSean Christopherson <seanjc@google.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NAndrey Ryabinin <ryabinin.a.a@gmail.com>
Link: https://lkml.kernel.org/r/20221110203504.1985010-3-seanjc@google.comSigned-off-by: NTong Tiangen <tongtiangen@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

8706d8f9

x86/mm: Recompute physical address for every page of per-CPU CEA mapping · 5949a893

由 Sean Christopherson 提交于 2月 22, 2023

mainline inclusion
from mainline-v6.2-rc1
commit 80d72a8f
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6C6UC
CVE: CVE-2023-0597

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=80d72a8f76e8f3f0b5a70b8c7022578e17bde8e7

--------------------------------

Recompute the physical address for each per-CPU page in the CPU entry
area, a recent commit inadvertantly modified cea_map_percpu_pages() such
that every PTE is mapped to the physical address of the first page.

Fixes: 9fd429c28073 ("x86/kasan: Map shadow for percpu pages on demand")
Signed-off-by: NSean Christopherson <seanjc@google.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NAndrey Ryabinin <ryabinin.a.a@gmail.com>
Link: https://lkml.kernel.org/r/20221110203504.1985010-2-seanjc@google.comSigned-off-by: NTong Tiangen <tongtiangen@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

5949a893

x86/kasan: Map shadow for percpu pages on demand · beb564bc

由 Andrey Ryabinin 提交于 2月 22, 2023

mainline inclusion
from mainline-v6.2-rc1
commit 3f148f33
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6C6UC
CVE: CVE-2023-0597

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3f148f3318140035e87decc1214795ff0755757b

--------------------------------

KASAN maps shadow for the entire CPU-entry-area:
  [CPU_ENTRY_AREA_BASE, CPU_ENTRY_AREA_BASE + CPU_ENTRY_AREA_MAP_SIZE]

This will explode once the per-cpu entry areas are randomized since it
will increase CPU_ENTRY_AREA_MAP_SIZE to 512 GB and KASAN fails to
allocate shadow for such big area.

Fix this by allocating KASAN shadow only for really used cpu entry area
addresses mapped by cea_map_percpu_pages()

Thanks to the 0day folks for finding and reporting this to be an issue.

[ dhansen: tweak changelog since this will get committed before peterz's
	   actual cpu-entry-area randomization ]
Signed-off-by: NAndrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
Tested-by: NYujie Liu <yujie.liu@intel.com>
Cc: kernel test robot <yujie.liu@intel.com>
Link: https://lore.kernel.org/r/202210241508.2e203c3d-yujie.liu@intel.comSigned-off-by: NTong Tiangen <tongtiangen@huawei.com>
Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

beb564bc

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功