Commits · 1446e1df9eb183fdf81c3f0715402f1d7595d4cb · openeuler / Kernel

02 Dec, 2020 2 commits

kernel: Implement selective syscall userspace redirection · 1446e1df

Authored by Gabriel Krisman Bertazi on Nov 27, 2020

Introduce a mechanism to quickly disable/enable syscall handling for a
specific process and redirect to userspace via SIGSYS.  This is useful
for processes with parts that require syscall redirection and parts that
don't, but that need to perform this boundary crossing really fast,
without paying the cost of a system call to reconfigure syscall handling
on each boundary transition.  This is particularly important for Windows
games running over Wine.

The proposed interface looks like this:

  prctl(PR_SET_SYSCALL_USER_DISPATCH, <op>, <offset>, <length>, [selector])

The range [<offset>,<offset>+<length>) is a part of the process memory
map that is allowed to by-pass the redirection code and dispatch
syscalls directly, such that in fast paths a process doesn't need to
disable the trap, nor does the kernel have to check the selector.  This is
essential to return from SIGSYS to a blocked area without triggering
another SIGSYS from rt_sigreturn.

selector is an optional pointer to a char-sized userspace memory region
that has a key switch for the mechanism. This key switch is set to
either PR_SYS_DISPATCH_ON or PR_SYS_DISPATCH_OFF to enable or disable the
redirection without calling the kernel.
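
The range and selector rules described above can be sketched as a small
user-space model in C.  This is not the kernel's implementation: struct
dispatch_cfg, should_redirect() and the SEL_ALLOW/SEL_BLOCK values are
hypothetical names chosen for illustration, and the sketch only shows the
order in which the two checks would apply.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical selector values; the real constants come from kernel headers. */
enum { SEL_ALLOW = 0, SEL_BLOCK = 1 };

/* Hypothetical per-thread configuration mirroring the prctl() arguments. */
struct dispatch_cfg {
    bool enabled;                  /* mechanism switched on for this thread */
    uintptr_t offset;              /* start of the always-allowed region */
    uintptr_t length;              /* size of the always-allowed region */
    const unsigned char *selector; /* optional user-space key switch, may be NULL */
};

/* Return true if a syscall issued at instruction pointer ip would be
 * redirected to userspace (SIGSYS) under this model. */
static bool should_redirect(const struct dispatch_cfg *cfg, uintptr_t ip)
{
    if (!cfg->enabled)
        return false;
    /* Syscalls from [offset, offset + length) bypass the selector entirely. */
    if (ip >= cfg->offset && ip - cfg->offset < cfg->length)
        return false;
    /* Outside the region the selector decides; without one, always redirect. */
    if (cfg->selector && *cfg->selector == SEL_ALLOW)
        return false;
    return true;
}
```

In this model a thread flips the byte behind selector to move between
"native" and "redirected" code without entering the kernel, while the
allowed region keeps working regardless of the selector value.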

The feature is meant to be set per-thread and it is disabled on
fork/clone/execv.

Internally, this doesn't add overhead to the syscall hot path, and it
requires very little per-architecture support.  I avoided using seccomp,
even though it duplicates some functionality, due to previous feedback
that maybe it shouldn't mix with seccomp since it is not a security
mechanism.  And obviously, this should never be considered a security
mechanism, since any part of the program can by-pass it by using the
syscall dispatcher.

For the sysinfo benchmark, which measures the overhead added to
executing a native syscall that doesn't require interception, the
overhead using only the direct dispatcher region to issue syscalls is
pretty much irrelevant.  The overhead of using the selector is around
40ns for a native (unredirected) syscall in my system, and it is (as
expected) dominated by the supervisor-mode user-address access.  In
fact, with SMAP off, the overhead is consistently less than 5ns on my
test box.
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20201127193238.821364-4-krisman@collabora.com

1446e1df

signal: Expose SYS_USER_DISPATCH si_code type · 1d7637d8

Authored by Gabriel Krisman Bertazi on Nov 27, 2020

SYS_USER_DISPATCH will be triggered when a syscall is sent to userspace
by the Syscall User Dispatch mechanism.  This adjusts eventual
BUILD_BUG_ON around the tree.
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20201127193238.821364-3-krisman@collabora.com

1d7637d8

25 Nov, 2020 1 commit

entry: Fix boot for !CONFIG_GENERIC_ENTRY · 5903f61e

Authored by Gabriel Krisman Bertazi on Nov 23, 2020

A copy-pasta mistake tries to set SYSCALL_WORK flags instead of TIF
flags for !CONFIG_GENERIC_ENTRY.  Also, add safeguards to catch this at
compilation time.

Fixes: 3136b93c ("entry: Expose helpers to migrate TIF to SYSCALL_WORK flags")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/87a6v8qd9p.fsf_-_@collabora.com

5903f61e

19 Nov, 2020 1 commit

context_tracking: Don't implement exception_enter/exit() on CONFIG_HAVE_CONTEXT_TRACKING_OFFSTACK · 179a9cf7

Authored by Frederic Weisbecker on Nov 17, 2020

The typical steps with context tracking are:

1) Task runs in userspace
2) Task enters the kernel (syscall/exception/IRQ)
3) Task switches from context tracking state CONTEXT_USER to
   CONTEXT_KERNEL (user_exit())
4) Task does stuff in kernel
5) Task switches from context tracking state CONTEXT_KERNEL to
   CONTEXT_USER (user_enter())
6) Task exits the kernel

If an exception fires between 5) and 6), the pt_regs and the context
tracking disagree on the context of the faulted/trapped instruction.
CONTEXT_KERNEL must be set before the exception handler, that's
unconditional for those handlers that want to be able to call into
schedule(), but CONTEXT_USER must be restored when the exception exits
whereas pt_regs tells that we are resuming to kernel space.

This can't be fixed with storing the context tracking state in a per-cpu
or per-task variable since another exception may fire onto the current
one and overwrite the saved state. Also the task can schedule. So it
has to be stored in a per task stack.

This is how exception_enter()/exception_exit() paper over the problem:

5) Task switches from context tracking state CONTEXT_KERNEL to
   CONTEXT_USER (user_enter())
5.1) Exception fires
5.2) prev_state = exception_enter() // save CONTEXT_USER to prev_state
                                    // and set CONTEXT_KERNEL
5.3) Exception handler
5.4) exception_exit(prev_state) // restore CONTEXT_USER
5.5) Exception resumes
6) Task exits the kernel
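
The save/restore sequence above can be modelled in plain C.  This is a
simplified user-space sketch, not the kernel code: ctx_current stands in
for the real context tracking state, and handler()/nested_handler() are
hypothetical.  It shows why prev_state has to live on each handler's own
stack: a nested exception saves and restores its own copy without
disturbing the outer one.

```c
/* Simplified model of the context tracking state; names mirror the text. */
enum ctx_state { CONTEXT_KERNEL, CONTEXT_USER };

/* Stand-in for the tracked context state. */
static enum ctx_state ctx_current = CONTEXT_KERNEL;

/* Save the state found at exception entry and switch to CONTEXT_KERNEL. */
static enum ctx_state exception_enter(void)
{
    enum ctx_state prev = ctx_current;

    ctx_current = CONTEXT_KERNEL;
    return prev;
}

/* Restore the state that the caller kept on its own stack. */
static void exception_exit(enum ctx_state prev)
{
    ctx_current = prev;
}

/* A nested exception keeps its own prev_state in its own stack frame. */
static void nested_handler(void)
{
    enum ctx_state prev_state = exception_enter(); /* saves CONTEXT_KERNEL */

    exception_exit(prev_state);
}

/* Exception firing between steps 5) and 6). */
static void handler(void)
{
    enum ctx_state prev_state = exception_enter(); /* saves CONTEXT_USER */

    nested_handler();
    exception_exit(prev_state);                    /* restores CONTEXT_USER */
}
```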

The condition to live without exception_enter()/exception_exit() is to
forbid exceptions and IRQs between 2) and 3) and between 5) and 6), or if
any is allowed to trigger, it won't call into context tracking, eg: NMIs,
and it won't schedule. These requirements are met by architectures
supporting CONFIG_HAVE_CONTEXT_TRACKING_OFFSTACK and those can
therefore afford not to implement this hack.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20201117151637.259084-3-frederic@kernel.org

179a9cf7

17 Nov, 2020 8 commits

entry: Drop usage of TIF flags in the generic syscall code · 29915524

Authored by Gabriel Krisman Bertazi on Nov 16, 2020

Now that the flags migration in the common syscall entry code is complete
and the code relies exclusively on thread_info::syscall_work, clean up the
accesses to TI flags in that path.
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20201116174206.2639648-10-krisman@collabora.com

29915524

audit: Migrate to use SYSCALL_WORK flag · 785dc4eb

Authored by Gabriel Krisman Bertazi on Nov 16, 2020

On architectures using the generic syscall entry code the architecture
independent syscall work is moved to flags in thread_info::syscall_work.
This removes architecture dependencies and frees up TIF bits.

Define SYSCALL_WORK_SYSCALL_AUDIT, use it in the generic entry code and
convert the code which uses the TIF specific helper functions to use the
new *_syscall_work() helpers which either resolve to the new mode for users
of the generic entry code or to the TIF based functions for the other
architectures.
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20201116174206.2639648-9-krisman@collabora.com

785dc4eb

ptrace: Migrate TIF_SYSCALL_EMU to use SYSCALL_WORK flag · 64eb35f7

Authored by Gabriel Krisman Bertazi on Nov 16, 2020

Define SYSCALL_WORK_SYSCALL_EMU, use it in the generic entry code and
convert the code which uses the TIF specific helper functions to use the
new *_syscall_work() helpers which either resolve to the new mode for users
of the generic entry code or to the TIF based functions for the other
architectures.
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20201116174206.2639648-8-krisman@collabora.com

64eb35f7

ptrace: Migrate to use SYSCALL_TRACE flag · 64c19ba2

Authored by Gabriel Krisman Bertazi on Nov 16, 2020

Define SYSCALL_WORK_SYSCALL_TRACE, use it in the generic entry code and
convert the code which uses the TIF specific helper functions to use the
new *_syscall_work() helpers which either resolve to the new mode for users
of the generic entry code or to the TIF based functions for the other
architectures.
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20201116174206.2639648-7-krisman@collabora.com

64c19ba2

tracepoints: Migrate to use SYSCALL_WORK flag · 524666cb

Authored by Gabriel Krisman Bertazi on Nov 16, 2020

Define SYSCALL_WORK_SYSCALL_TRACEPOINT, use it in the generic entry code
and convert the code which uses the TIF specific helper functions to use
the new *_syscall_work() helpers which either resolve to the new mode for
users of the generic entry code or to the TIF based functions for the other
architectures.
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20201116174206.2639648-6-krisman@collabora.com

524666cb

seccomp: Migrate to use SYSCALL_WORK flag · 23d67a54

Authored by Gabriel Krisman Bertazi on Nov 16, 2020

On architectures using the generic syscall entry code the architecture
independent syscall work is moved to flags in thread_info::syscall_work.
This removes architecture dependencies and frees up TIF bits.

Define SYSCALL_WORK_SECCOMP, use it in the generic entry code and convert
the code which uses the TIF specific helper functions to use the new
*_syscall_work() helpers which either resolve to the new mode for users of
the generic entry code or to the TIF based functions for the other
architectures.
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20201116174206.2639648-5-krisman@collabora.com

23d67a54

entry: Wire up syscall_work in common entry code · b86678cf

Authored by Gabriel Krisman Bertazi on Nov 16, 2020

Prepare the common entry code to use the SYSCALL_WORK flags. They will
be defined in subsequent patches for each type of syscall
work. SYSCALL_WORK_ENTRY/EXIT are defined for the transition, as they
will replace the TIF_ equivalent defines.
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20201116174206.2639648-4-krisman@collabora.com

b86678cf

entry: Expose helpers to migrate TIF to SYSCALL_WORK flags · 3136b93c

Authored by Gabriel Krisman Bertazi on Nov 16, 2020

With the goal to split the syscall work related flags into a separate
field that is architecture independent, expose transitional helpers that
resolve to either the TIF flags or to the corresponding SYSCALL_WORK
flags. This will allow architectures to migrate only when they port to
the generic syscall entry code.
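
The idea of transitional helpers can be sketched with a few macros.  This
is a simplified illustration rather than the kernel's definitions: struct
thread_info_model, the MODEL_GENERIC_ENTRY switch and the bit values are
assumptions made for the example.  The point is that call sites name only
the flag suffix, and the helper decides which field and which constant
family are used.

```c
/* Hypothetical, simplified stand-in for struct thread_info. */
struct thread_info_model {
    unsigned long flags;        /* architecture-specific TIF_* bits */
    unsigned long syscall_work; /* architecture-independent SYSCALL_WORK_* bits */
};

/* Illustrative bit masks; the real values are defined by the kernel headers. */
#define TIF_SECCOMP          (1UL << 8)
#define SYSCALL_WORK_SECCOMP (1UL << 0)

#ifdef MODEL_GENERIC_ENTRY
/* Generic entry users: helpers resolve to the syscall_work field. */
#define set_syscall_work(ti, fl)   ((ti)->syscall_work |= SYSCALL_WORK_##fl)
#define test_syscall_work(ti, fl)  (((ti)->syscall_work & SYSCALL_WORK_##fl) != 0)
#define clear_syscall_work(ti, fl) ((ti)->syscall_work &= ~SYSCALL_WORK_##fl)
#else
/* Other architectures: the same call sites keep using the TIF flags. */
#define set_syscall_work(ti, fl)   ((ti)->flags |= TIF_##fl)
#define test_syscall_work(ti, fl)  (((ti)->flags & TIF_##fl) != 0)
#define clear_syscall_work(ti, fl) ((ti)->flags &= ~TIF_##fl)
#endif
```

A caller would write set_syscall_work(&ti, SECCOMP) in either
configuration; mixing up the two constant families inside one branch is
the kind of mistake a compile-time safeguard is meant to catch.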
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20201116174206.2639648-3-krisman@collabora.com

3136b93c

16 Nov, 2020 1 commit

entry: Fix spelling/typo errors in irq entry code · 78a56e04

Authored by Ira Weiny on Nov 04, 2020

s/reguired/required/
s/Interupts/Interrupts/
s/quiescient/quiescent/
s/assemenbly/assembly/
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201104230157.3378023-1-ira.weiny@intel.com

78a56e04

05 Nov, 2020 1 commit

x86/entry: Move nmi entry/exit into common code · b6be002b

Authored by Thomas Gleixner on Nov 02, 2020

Lockdep state handling on NMI enter and exit is nothing specific to X86. It's
not any different on other architectures. Also the extra state type is not
necessary, irqentry_state_t can carry the necessary information as well.

Move it to common code and extend irqentry_state_t to carry lockdep state.

[ Ira: Make exit_rcu and lockdep a union as they are mutually exclusive
  between the IRQ and NMI exceptions, and add kernel documentation for
  struct irqentry_state_t ]
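
As a rough sketch of what such a union looks like (a user-space
illustration based only on the description above, not a copy of the kernel
header), the two booleans can share storage because a given entry is
either an IRQ/exception or an NMI, never both:

```c
#include <stdbool.h>

/* Sketch of the extended state: an IRQ entry only needs exit_rcu and an
 * NMI entry only needs lockdep, so both members occupy the same storage. */
typedef struct irqentry_state {
    union {
        bool exit_rcu; /* used by the IRQ/exception entry path */
        bool lockdep;  /* used by the NMI entry path */
    };
} irqentry_state_t;
```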
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201102205320.1458656-7-ira.weiny@intel.com

b6be002b

31 Oct, 2020 1 commit

net/mlx5: Replace zero-length array with flexible-array member · 29056207

Authored by Gustavo A. R. Silva on Oct 27, 2020

There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
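
A minimal user-space sketch of the flexible-array-member style follows.
struct msg and msg_new() are hypothetical names for illustration; kernel
code would typically compute the allocation size with a helper such as
struct_size() rather than open-coded arithmetic.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical message type with a flexible array member as its last field. */
struct msg {
    size_t len;
    unsigned char data[]; /* flexible array member: adds no size of its own */
};

/* Allocate a message with room for len trailing bytes and copy them in.
 * Returns NULL on allocation failure; the caller frees the result. */
static struct msg *msg_new(const void *src, size_t len)
{
    struct msg *m = malloc(sizeof(*m) + len);

    if (!m)
        return NULL;
    m->len = len;
    memcpy(m->data, src, len);
    return m;
}
```

Unlike `data[0]` or `data[1]`, the `data[]` form is standard C, must be
the last member, and lets the compiler diagnose misuse such as taking
sizeof of the array.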

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9/process/deprecated.html#zero-length-and-one-element-arrays

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>

29056207

30 Oct, 2020 7 commits

debugfs: remove return value of debugfs_create_devm_seqfile() · 0d519cbf

Authored by Greg Kroah-Hartman on Oct 23, 2020

No one checks the return value of debugfs_create_devm_seqfile(), as it's
not needed, so make the return value void, so that no one tries to do so
in the future.

Link: https://lore.kernel.org/r/20201023131037.2500765-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

0d519cbf

fs: Replace zero-length array with flexible-array member · 5e01fdff

Authored by Gustavo A. R. Silva on Aug 31, 2020

There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>

5e01fdff

platform/chrome: cros_ec_proto: Replace zero-length array with flexible-array member · 12008883

Authored by Gustavo A. R. Silva on Aug 31, 2020

There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>

12008883

platform/chrome: cros_ec_commands: Replace zero-length array with flexible-array member · 88354105

Authored by Gustavo A. R. Silva on Aug 31, 2020

There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>

88354105

mailbox: zynqmp-ipi-message: Replace zero-length array with flexible-array member · 277ffd6c

Authored by Gustavo A. R. Silva on Aug 31, 2020

There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>

277ffd6c

dmaengine: ti-cppi5: Replace zero-length array with flexible-array member · a4147d85

Authored by Gustavo A. R. Silva on Aug 31, 2020

There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>

a4147d85

include: jhash/signal: Fix fall-through warnings for Clang · 4169e889

Authored by Gustavo A. R. Silva on Sep 02, 2020

In preparation to enable -Wimplicit-fallthrough for Clang, explicitly
add break statements instead of letting the code fall through to the
next case.
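
The pattern being applied can be illustrated with a small, hypothetical
function (classify() is not taken from the patch).  Every case ends with
an explicit terminating statement, including the one that would otherwise
fall through to a label containing nothing but a break:

```c
/* Hypothetical example of explicit case termination. */
static int classify(int tag)
{
    int score = 0;

    switch (tag) {
    case 1:
        score += 1;
        break;      /* explicit: do not fall into case 2 */
    case 2:
        score += 2;
        break;      /* explicit even though the next label only breaks */
    default:
        break;
    }
    return score;
}
```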

This patch adds four break statements that, together, fix almost 40,000
warnings when building Linux 5.10-rc1 with Clang 12.0.0 and this[1] change
reverted. Notice that in order to enable -Wimplicit-fallthrough for Clang,
such change[1] is meant to be reverted at some point. So, this patch helps
to move in that direction.

Something important to mention is that there is currently a discrepancy
between GCC and Clang when dealing with switch fall-through to empty case
statements or to cases that only contain a break/continue/return
statement[2][3][4].

Now that the -Wimplicit-fallthrough option has been globally enabled[5],
any compiler should really warn on missing either a fallthrough annotation
or any of the other case-terminating statements (break/continue/return/
goto) when falling through to the next case statement. Making exceptions
to this introduces variation in case handling which may continue to lead
to bugs, misunderstandings, and a general lack of robustness. The point
of enabling options like -Wimplicit-fallthrough is to prevent human error
and aid developers in spotting bugs before their code is even built/
submitted/committed, therefore eliminating classes of bugs. So, in order
to really accomplish this, we should, and can, move in the direction of
addressing any error-prone scenarios and get rid of the unintentional
fallthrough bug-class in the kernel, entirely, even if there is some minor
redundancy. Better to have explicit case-ending statements than continue to
have exceptions where one must guess as to the right result. The compiler
will eliminate any actual redundancy.

[1] commit e2079e93 ("kbuild: Do not enable -Wimplicit-fallthrough for clang for now")
[2] https://github.com/ClangBuiltLinux/linux/issues/636
[3] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91432
[4] https://godbolt.org/z/xgkvIh
[5] commit a035d552 ("Makefile: Globally enable fall-through warning")
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>

4169e889

29 Oct, 2020 8 commits

afs: Fix afs_invalidatepage to adjust the dirty region · f86726a6

Authored by David Howells on Oct 22, 2020

Fix afs_invalidatepage() to adjust the dirty region recorded in
page->private when truncating a page. If the dirty region is entirely
removed, then the private data is cleared and the page dirty state is
cleared.

Without this, if the page is truncated and then expanded again by truncate,
zeros from the expanded, but no-longer dirty region may get written back to
the server if the page gets laundered due to a conflicting 3rd-party write.

It mustn't, however, shorten the dirty region of the page if that page is
still mmapped and has been marked dirty by afs_page_mkwrite(), so a flag is
stored in page->private to record this.

Fixes: 4343d008 ("afs: Get rid of the afs_writeback record")
Signed-off-by: David Howells <dhowells@redhat.com>

f86726a6

afs: Wrap page->private manipulations in inline functions · 185f0c70

Authored by David Howells on Oct 26, 2020

The afs filesystem uses page->private to store the dirty range within a
page such that in the event of a conflicting 3rd-party write to the server,
we write back just the bits that got changed locally.

However, there are a couple of problems with this:

 (1) I need a bit to note if the page might be mapped so that partial
     invalidation doesn't shrink the range.

 (2) There aren't necessarily sufficient bits to store the entire range of
     data altered (say it's a 32-bit system with 64KiB pages or transparent
     huge pages are in use).

So wrap the accesses in inline functions so that future commits can change
how this works.

Also move them out of the tracing header into the in-directory header.
There's not really any need for them to be in the tracing header.
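
One way such wrappers could look is sketched below.  The layout is an
assumption made for the example (4 KiB pages, 12 bits per offset, one
extra flag bit) and the function names are hypothetical; the value of
hiding the encoding behind inline functions is that it can later change
without touching the callers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout, assuming 4 KiB pages: 12 bits for "from",
 * 12 bits for "to - 1", and one flag bit for "page may be mmapped". */
#define PRIV_SHIFT   12
#define PRIV_MASK    ((UINT32_C(1) << PRIV_SHIFT) - 1)
#define PRIV_MMAPPED (UINT32_C(1) << (2 * PRIV_SHIFT))

/* Pack the dirty range [from, to); requires from < to <= 4096. */
static inline uint32_t page_dirty(uint32_t from, uint32_t to)
{
    return ((to - 1) << PRIV_SHIFT) | from;
}

/* Mark the packed value as belonging to a possibly mmapped page. */
static inline uint32_t page_dirty_mmapped(uint32_t priv)
{
    return priv | PRIV_MMAPPED;
}

/* Extract the start of the dirty range. */
static inline uint32_t page_dirty_from(uint32_t priv)
{
    return priv & PRIV_MASK;
}

/* Extract the (exclusive) end of the dirty range. */
static inline uint32_t page_dirty_to(uint32_t priv)
{
    return ((priv >> PRIV_SHIFT) & PRIV_MASK) + 1;
}

/* Test the "may be mmapped" flag. */
static inline bool page_is_mmapped(uint32_t priv)
{
    return (priv & PRIV_MMAPPED) != 0;
}
```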
Signed-off-by: David Howells <dhowells@redhat.com>

185f0c70

cpufreq: Introduce cpufreq_driver_test_flags() · a62f68f5

Authored by Rafael J. Wysocki on Oct 23, 2020

Add a helper function to test the flags of the cpufreq driver in use
against a given flags mask.

In particular, this will be needed to test the
CPUFREQ_NEED_UPDATE_LIMITS cpufreq driver flag in the schedutil
governor.
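
A helper of this kind can be sketched in a few lines.  The structure,
flag values and the NULL check below are assumptions for a stand-alone
illustration, not the cpufreq core's actual definitions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real driver structure and flag values differ. */
#define DRV_FLAG_A             (1u << 0)
#define DRV_NEED_UPDATE_LIMITS (1u << 1)

struct driver_model {
    uint16_t flags;
};

/* Models the single driver currently registered with the core. */
static struct driver_model *current_driver;

/* Return true if any bit of mask is set in the registered driver's flags. */
static bool driver_test_flags(uint16_t mask)
{
    return current_driver && (current_driver->flags & mask) != 0;
}
```

A governor can then ask about the driver in use without needing direct
access to the driver structure.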
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

a62f68f5

entry: Add support for TIF_NOTIFY_SIGNAL · 12db8b69

Authored by Jens Axboe on Oct 26, 2020

Add TIF_NOTIFY_SIGNAL handling in the generic entry code, which if set,
will return true if signal_pending() is used in a wait loop. That causes an
exit of the loop so that notify_signal tracehooks can be run. If the wait
loop is currently inside a system call, the system call is restarted once
task_work has been processed.

In preparation for only having arch_do_signal() handle syscall restarts if
_TIF_SIGPENDING isn't set, rename it to arch_do_signal_or_restart().  Pass
in a boolean that tells the architecture specific signal handler if it
should attempt to get a signal, or just process a potential syscall
restart.

For !CONFIG_GENERIC_ENTRY archs, add the TIF_NOTIFY_SIGNAL handling to
get_signal(). This is done to minimize the needed architecture changes to
support this feature.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20201026203230.386348-3-axboe@kernel.dk

12db8b69

signal: Add task_sigpending() helper · 5c251e9d

Authored by Jens Axboe on Oct 26, 2020

This is in preparation for maintaining signal_pending() as the decider of
whether or not a schedule() loop should be broken, or continue sleeping.
This is different than the core signal use cases, which really need to know
whether an actual signal is pending or not. task_sigpending() returns
non-zero if TIF_SIGPENDING is set.

Only core kernel use cases should care about the distinction between
the two, make sure those use the task_sigpending() helper.
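
The distinction can be modelled in user space as follows.  This sketch
assumes the TIF_NOTIFY_SIGNAL handling from the related entry patch is in
place; struct task_model and the bit values are illustrative, not the
kernel's definitions.

```c
#include <stdbool.h>

/* Illustrative bit masks; the real TIF_* values are per-architecture. */
#define TIF_SIGPENDING_BIT    (1UL << 0)
#define TIF_NOTIFY_SIGNAL_BIT (1UL << 1)

/* Hypothetical stand-in for a task's thread flags. */
struct task_model {
    unsigned long ti_flags;
};

/* True only when an actual signal is pending. */
static bool task_sigpending(const struct task_model *t)
{
    return (t->ti_flags & TIF_SIGPENDING_BIT) != 0;
}

/* True whenever a wait loop should be broken, whether for a real signal
 * or for pending notification work. */
static bool signal_pending(const struct task_model *t)
{
    if (t->ti_flags & TIF_NOTIFY_SIGNAL_BIT)
        return true;
    return task_sigpending(t);
}
```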
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20201026203230.386348-2-axboe@kernel.dk

5c251e9d

misc: mic: remove the MIC drivers · 80ade22c

Authored by Sudeep Dutt on Oct 27, 2020

This patch removes the MIC drivers from the kernel tree
since the corresponding devices have been discontinued.

Removing the dma and char-misc changes in one patch and
merging via the char-misc tree is best to avoid any
potential build breakage.

Cc: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Acked-By: Vinod Koul <vkoul@kernel.org>
Reviewed-by: Sherry Sun <sherry.sun@nxp.com>
Link: https://lore.kernel.org/r/8c1443136563de34699d2c084df478181c205db4.1603854416.git.sudeep.dutt@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

80ade22c

jbd2: fix a kernel-doc markup · ea4b01d9

Authored by Mauro Carvalho Chehab on Oct 27, 2020

The kernel-doc markup that documents _fc_replay_callback is
missing an asterisk, causing this warning:

	../include/linux/jbd2.h:1271: warning: Function parameter or member 'j_fc_replay_callback' not described in 'journal_s'

When building the docs.

Fixes: 609f928af48f ("jbd2: fast commit recovery path")
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/6055927ada2015b55b413cdd2670533bdc9a8da2.1603791716.git.mchehab+huawei@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ea4b01d9

ext4: make num of fast commit blocks configurable · e029c5f2

Authored by Harshad Shirwadkar on Oct 26, 2020

This patch reserves a field in the jbd2 superblock for number of fast
commit blocks. When this value is non-zero, Ext4 uses this field to
set the number of fast commit blocks.

Fixes: 6866d7b3 ("ext4/jbd2: add fast commit initialization")
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20201027044915.2553163-2-harshadshirwadkar@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

e029c5f2

28 Oct, 2020 6 commits

module: use hidden visibility for weak symbol references · 13150bc5

Authored by Ard Biesheuvel on Oct 27, 2020

Geert reports that commit be288182 ("arm64/build: Assert for
unwanted sections") results in build errors on arm64 for configurations
that have CONFIG_MODULES disabled.

The commit in question added ASSERT()s to the arm64 linker script to
ensure that linker generated sections such as .got.plt etc are empty,
but as it turns out, there are corner cases where the linker does emit
content into those sections. More specifically, weak references to
function symbols (which can remain unsatisfied, and can therefore not
be emitted as relative references) will be emitted as GOT and PLT
entries when linking the kernel in PIE mode (which is the case when
CONFIG_RELOCATABLE is enabled, which is on by default).

What happens is that code such as

	struct device *(*fn)(struct device *dev);
	struct device *iommu_device;

	fn = symbol_get(mdev_get_iommu_device);
	if (fn) {
		iommu_device = fn(dev);

essentially gets converted into the following when CONFIG_MODULES is off:

	struct device *iommu_device;

	if (&mdev_get_iommu_device) {
		iommu_device = mdev_get_iommu_device(dev);

where mdev_get_iommu_device is emitted as a weak symbol reference into
the object file. The first reference is decorated with an ordinary
ABS64 data relocation (which yields 0x0 if the reference remains
unsatisfied). However, the indirect call is turned into a direct call
covered by a R_AARCH64_CALL26 relocation, which is converted into a
call via a PLT entry taking the target address from the associated
GOT entry.

Given that such GOT and PLT entries are unnecessary for fully linked
binaries such as the kernel, let's give these weak symbol references
hidden visibility, so that the linker knows that the weak reference
via R_AARCH64_CALL26 can simply remain unsatisfied.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Fangrui Song <maskray@google.com>
Acked-by: Jessica Yu <jeyu@kernel.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20201027151132.14066-1-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>

13150bc5

usb: fix kernel-doc markups · cbdc0f54

Authored by Mauro Carvalho Chehab on Oct 23, 2020

There is a common comment marked, instead, with kernel-doc
notation.

Also, some identifiers have different names between their
prototypes and the kernel-doc markup.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Felipe Balbi <balbi@kernel.org>
Link: https://lore.kernel.org/r/0b964be3884def04fcd20ea5c12cb90d0014871c.1603469755.git.mchehab+huawei@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cbdc0f54

RDMA: Add rdma_connect_locked() · 071ba4cc

Authored by Jason Gunthorpe on Oct 26, 2020

There are two flows for handling RDMA_CM_EVENT_ROUTE_RESOLVED, either the
handler triggers a completion and another thread does rdma_connect() or
the handler directly calls rdma_connect().

In all cases rdma_connect() needs to hold the handler_mutex, but when
handlers are invoked this is already held by the core code. This causes
ULPs using the 2nd method to deadlock.

Provide a rdma_connect_locked() and have all ULPs call it from their
handlers.
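
The locked/unlocked split described here is a common pattern and can be
shown with a toy model.  Nothing below is RDMA code: struct toy_id, the
function names and the return codes are hypothetical, and a plain boolean
stands in for a non-recursive mutex so that the would-be deadlock shows up
as an error value instead of a hang.

```c
#include <stdbool.h>

/* Toy object whose boolean stands in for a non-recursive handler mutex. */
struct toy_id {
    bool handler_locked;
    bool connected;
};

/* Caller must already hold the lock (e.g. when called from an event handler). */
static int toy_connect_locked(struct toy_id *id)
{
    if (!id->handler_locked)
        return -1;          /* contract violation in this model */
    id->connected = true;
    return 0;
}

/* Takes the lock itself.  Calling this with the lock already held would
 * deadlock on a real non-recursive mutex; the model reports that as -2. */
static int toy_connect(struct toy_id *id)
{
    int ret;

    if (id->handler_locked)
        return -2;
    id->handler_locked = true;
    ret = toy_connect_locked(id);
    id->handler_locked = false;
    return ret;
}
```

A handler that runs with the lock held calls the _locked variant, while
any other thread keeps using the plain one.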

Link: https://lore.kernel.org/r/0-v2-53c22d5c1405+33-rdma_connect_locking_jgg@nvidia.com
Reported-and-tested-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Fixes: 2a7cec53 ("RDMA/cma: Fix locking for the RDMA_CM_CONNECT state")
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

071ba4cc

KVM: arm64: ARM_SMCCC_ARCH_WORKAROUND_1 doesn't return SMCCC_RET_NOT_REQUIRED · 1de111b5

Authored by Stephen Boyd on Oct 23, 2020

According to the SMCCC spec[1](7.5.2 Discovery) the
ARM_SMCCC_ARCH_WORKAROUND_1 function id only returns 0, 1, and
SMCCC_RET_NOT_SUPPORTED.

 0 is "workaround required and safe to call this function"
 1 is "workaround not required but safe to call this function"
 SMCCC_RET_NOT_SUPPORTED is "might be vulnerable or might not be, who knows, I give up!"

SMCCC_RET_NOT_SUPPORTED might as well mean "workaround required, except
calling this function may not work because it isn't implemented in some
cases". Wonderful. We map this SMC call to

 0 is SPECTRE_MITIGATED
 1 is SPECTRE_UNAFFECTED
 SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE

For KVM hypercalls (hvc), we've implemented this function id to return
SMCCC_RET_NOT_SUPPORTED, 0, and SMCCC_RET_NOT_REQUIRED. One of those
isn't supposed to be there. Per the code we call
arm64_get_spectre_v2_state() to figure out what to return for this
feature discovery call.

 0 is SPECTRE_MITIGATED
 SMCCC_RET_NOT_REQUIRED is SPECTRE_UNAFFECTED
 SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE

Let's clean this up so that KVM tells the guest this mapping:

 0 is SPECTRE_MITIGATED
 1 is SPECTRE_UNAFFECTED
 SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE

Note: SMCCC_RET_NOT_AFFECTED is 1 but isn't part of the SMCCC spec
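
The cleaned-up mapping can be written as a small switch.  This is a
stand-alone sketch, not the KVM code: workaround_1_discovery() is a
hypothetical name, the enumerator order is an assumption, and -1 is used
here as the value of SMCCC_RET_NOT_SUPPORTED.

```c
/* Illustrative definitions for a stand-alone sketch. */
enum spectre_state { SPECTRE_UNAFFECTED, SPECTRE_MITIGATED, SPECTRE_VULNERABLE };
#define SMCCC_RET_NOT_SUPPORTED (-1L)

/* Value reported to the guest for the feature discovery call. */
static long workaround_1_discovery(enum spectre_state state)
{
    switch (state) {
    case SPECTRE_MITIGATED:
        return 0;   /* workaround required and safe to call */
    case SPECTRE_UNAFFECTED:
        return 1;   /* workaround not required but safe to call */
    case SPECTRE_VULNERABLE:
    default:
        return SMCCC_RET_NOT_SUPPORTED;
    }
}
```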

Fixes: c118bbb5 ("arm64: KVM: Propagate full Spectre v2 workaround state to KVM guests")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://developer.arm.com/documentation/den0028/latest [1]
Link: https://lore.kernel.org/r/20201023154751.1973872-1-swboyd@chromium.org
Signed-off-by: Will Deacon <will@kernel.org>

1de111b5

vmlinux.lds.h: Keep .ctors.* with .ctors · 3e663148

Authored by Kees Cook on Oct 04, 2020

Under some circumstances, the compiler generates .ctors.* sections. This
is seen doing a cross compile of x86_64 from a powerpc64el host:

x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from `kernel/trace/trace_clock.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from `kernel/trace/ftrace.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from `kernel/trace/ring_buffer.o' being placed in section `.ctors.65435'

Include these orphans along with the regular .ctors section.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 83109d5d ("x86/build: Warn on orphan section placement")
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20201005025720.2599682-1-keescook@chromium.org

cpufreq: Introduce CPUFREQ_NEED_UPDATE_LIMITS driver flag · 1c534352

Committed by Rafael J. Wysocki on Oct 23, 2020

Generally, a cpufreq driver may need to update some internal upper
and lower frequency boundaries on policy max and min changes,
respectively, but currently this does not work if the target
frequency does not change along with the policy limit.

Namely, if the target frequency does not change along with the
policy min or max, the "target_freq == policy->cur" check in
__cpufreq_driver_target() prevents driver callbacks from being
invoked and they do not even have a chance to update the
corresponding internal boundary.

This particularly affects the "powersave" and "performance"
governors that always set the target frequency to one of the
policy limits and it never changes when the other limit is updated.

To allow the cpufreq drivers that need to update internal frequency
boundaries on policy limit changes to avoid this issue, introduce
a new driver flag, CPUFREQ_NEED_UPDATE_LIMITS, that (when set) will
neutralize the check mentioned above.
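
The effect of the flag on the "target_freq == policy->cur" check can be sketched with a standalone model. This is an illustration under stated assumptions, not the cpufreq core: the struct layouts, the function name `driver_target_sketch()`, and the flag's bit value are all hypothetical; only the shape of the check and the flag's purpose come from the text above.

```c
#include <stdbool.h>

/* Placeholder bit value for this sketch; the real flag value is not shown here. */
#define NEED_UPDATE_LIMITS_SKETCH (1u << 0)

struct policy_sketch {
	unsigned int cur, min, max;	/* current frequency and policy limits */
};

struct driver_sketch {
	unsigned int flags;
	unsigned int hw_min, hw_max;	/* driver-internal boundaries */
	unsigned int calls;		/* number of callback invocations */
};

/* Returns true if the driver callback ran, false if it was skipped. */
static bool driver_target_sketch(struct driver_sketch *drv,
				 struct policy_sketch *policy,
				 unsigned int target_freq)
{
	if (target_freq < policy->min)
		target_freq = policy->min;
	if (target_freq > policy->max)
		target_freq = policy->max;

	/* Without the flag, an unchanged target skips the driver entirely. */
	if (target_freq == policy->cur &&
	    !(drv->flags & NEED_UPDATE_LIMITS_SKETCH))
		return false;

	/* Stand-in for the driver callback: refresh the internal boundaries. */
	drv->hw_min = policy->min;
	drv->hw_max = policy->max;
	drv->calls++;
	policy->cur = target_freq;
	return true;
}
```

In this model, a "powersave"-style caller that keeps requesting the policy minimum after the maximum changes only reaches the callback when the flag is set.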

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

Oct 27, 2020 · 4 commits

asm-generic: mark __{get,put}_user_fn as __always_inline · 0bcd0a2b

Committed by Christoph Hellwig on Oct 27, 2020

Without the explicit __always_inline, some RISC-V configs place the
functions out of line, triggering the BUILD_BUG_ON checks in the
function.
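
A userspace sketch of the pattern, assuming the checks in question depend on the size argument being a compile-time constant at each call site. The names `always_inline_sketch`, `get_user_sketch()` and `read_u32_sketch()` are hypothetical, and the compile-time check itself is only described in a comment so that the example builds at any optimization level.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* GCC/Clang attribute that forces inlining; stand-in for the kernel's __always_inline. */
#define always_inline_sketch inline __attribute__((__always_inline__))

static always_inline_sketch int get_user_sketch(size_t size, const void *from,
						void *to)
{
	/*
	 * A kernel helper of this shape would carry a BUILD_BUG_ON-style
	 * check here. Such a check can only pass when the function body is
	 * inlined into a caller that passes a constant size, which is what
	 * the forced inlining is meant to guarantee.
	 */
	switch (size) {
	case 1:
	case 2:
	case 4:
	case 8:
		memcpy(to, from, size);
		return 0;
	default:
		return -1;	/* unsupported access size */
	}
}

/* Caller with a constant size: sizeof() is known at compile time. */
static int read_u32_sketch(const uint32_t *src, uint32_t *dst)
{
	return get_user_sketch(sizeof(*src), src, dst);
}
```

With a plain `static inline`, the compiler is free to emit the helper out of line, at which point `size` is an ordinary runtime parameter inside it.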

Fixes: 11129e8e ("riscv: use memcpy based uaccess for nommu again")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

drm: drm_print.h: fix kernel-doc markups · b52817e9

Committed by Mauro Carvalho Chehab on Oct 27, 2020

A kernel-doc markup should start with the identifier on its
first line.
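
As an illustration of the convention, here is a kernel-doc comment whose first line names the identifier it documents. The function `sketch_clamp_level()` is hypothetical and does not come from drm_print.h; it only shows the comment layout.

```c
/**
 * sketch_clamp_level() - Clamp a log level to the range 0..7
 * @level: requested log level
 *
 * The first line after the opening marker starts with the identifier,
 * followed by a short description.
 *
 * Return: @level limited to the range 0..7.
 */
static int sketch_clamp_level(int level)
{
	if (level < 0)
		return 0;
	if (level > 7)
		return 7;
	return level;
}
```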

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/5b76c5625709aaaa3abee98faa620b9f3d27ff85.1603791716.git.mchehab+huawei@kernel.org

drm: kernel-doc: drm_dp_helper.h: fix a typo · 38a8b32f

Committed by Mauro Carvalho Chehab on Oct 27, 2020

Right now, kernel-doc generates a warning:
	./include/drm/drm_dp_helper.h:1786: warning: Function parameter or member 'hbr2_reset' not described in 'drm_dp_phy_test_params'

This is due to a typo:

	@hb2_reset -> @hbr2_reset

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/2a615cb38e951215bb1bddc2481ad323c9cf3fc9.1603791716.git.mchehab+huawei@kernel.org

drm: drm_edid: remove a duplicated kernel-doc declaration · 08989335

Committed by Mauro Carvalho Chehab on Oct 27, 2020

It is not possible to create cross-references for duplicated
symbols. While Sphinx always detected it, on Sphinx 3 it
generates warnings like this:

	.../Documentation/gpu/drm-kms-helpers:326: ../drivers/gpu/drm/drm_edid.c:1626: WARNING: Duplicate C declaration, also defined in 'gpu/drm-kms-helpers'.
	Declaration is 'bool drm_edid_are_equal (const struct edid *edid1, const struct edid *edid2)'.

So, get rid of the duplicated kernel-doc markup.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/9310f4074fa9d29cd3ad60684d86d0ace8dab7ae.1603791716.git.mchehab+huawei@kernel.org

openeuler / Kernel: last synced successfully nearly 2 years ago