Commits · 39218ff4c625dbf2e68224024fe0acaa60bcd51a · openeuler / Kernel

08 Apr, 2021 1 commit

stack: Optionally randomize kernel stack offset each syscall · 39218ff4

Committed by Kees Cook on Apr 01, 2021

This provides the ability for architectures to enable kernel stack base
address offset randomization. This feature is controlled by the boot
param "randomize_kstack_offset=on/off", with its default value set by
CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT.

This feature is based on the original idea from the last public release
of PaX's RANDKSTACK feature: https://pax.grsecurity.net/docs/randkstack.txt
All the credit for the original idea goes to the PaX team. Note that
the design and implementation of this upstream randomize_kstack_offset
feature differs greatly from the RANDKSTACK feature (see below).

Reasoning for the feature:

This feature aims to make the various stack-based attacks that rely on
deterministic stack structure harder. There have been many such attacks
in the past (to name just a few):

https://jon.oberheide.org/files/infiltrate12-thestackisback.pdf
https://jon.oberheide.org/files/stackjacking-infiltrate11.pdf
https://googleprojectzero.blogspot.com/2016/06/exploiting-recursion-in-linux-kernel_20.html

As Linux kernel stack protections have been constantly improving
(vmap-based stack allocation with guard pages, removal of thread_info,
STACKLEAK), attackers have had to find new ways for their exploits
to work. They have done so, continuing to rely on the kernel's stack
determinism, in situations where VMAP_STACK and THREAD_INFO_IN_TASK_STRUCT
were not relevant. For example, the following recent attacks would have
been hampered if the stack offset was non-deterministic between syscalls:

https://repositorio-aberto.up.pt/bitstream/10216/125357/2/374717.pdf
(page 70: targeting the pt_regs copy with linear stack overflow)

https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
(leaked stack address from one syscall as a target during next syscall)

The main idea is that since the stack offset is randomized on each system
call, it is harder for an attack to reliably land in any particular place
on the thread stack, even with address exposures, as the stack base will
change on the next syscall. Also, since randomization is performed after
placing pt_regs, the ptrace-based approach[1] to discover the randomized
offset during a long-running syscall should not be possible.

Design description:

During most of the kernel's execution, it runs on the "thread stack",
which is pretty deterministic in its structure: it is fixed in size,
and on every entry from userspace to kernel on a syscall the thread
stack starts construction from an address fetched from the per-cpu
cpu_current_top_of_stack variable. The first element to be pushed to the
thread stack is the pt_regs struct that stores all required CPU registers
and syscall parameters. Finally the specific syscall function is called,
with the stack being used as the kernel executes the resulting request.

The goal of the randomize_kstack_offset feature is to add a random offset
after the pt_regs has been pushed to the stack and before the rest of the
thread stack is used during the syscall processing, and to change it every
time a process issues a syscall. The source of randomness is currently
architecture-defined (but x86 is using the low byte of rdtsc()). Future
improvements for different entropy sources are possible, but out of scope
for this patch. Furthermore, to add more unpredictability, new offsets
are chosen at the end of syscalls (the timing of which should be less
easy to measure from userspace than at syscall entry time), and stored
in a per-CPU variable, so that the life of the value does not stay
explicitly tied to a single task.

As suggested by Andy Lutomirski, the offset is added using alloca()
and an empty asm() statement with an output constraint, since it avoids
changes to assembly syscall entry code, to the unwinder, and provides
correct stack alignment as defined by the compiler.
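
The alloca() trick can be tried out in userspace. The following is only a
minimal sketch, assuming GCC or Clang: the helper names are hypothetical,
the "=m" constraint is one possible choice of output constraint, and the
0x3FF mask is taken from the disassembly quoted in this message. It is not
the kernel macro itself.

```c
#include <assert.h>
#include <stdint.h>

/* Same 10-bit cap as the "and $0x3ff" in the quoted disassembly. */
#define KSTACK_OFFSET_MAX(x) ((x) & 0x3FFu)

/* Hypothetical probe: reports where a callee's frame ends up. */
__attribute__((noinline)) static uintptr_t callee_frame_address(void)
{
	/* Empty asm so the compiler treats this as a real call. */
	__asm__ volatile("" : : : "memory");
	return (uintptr_t)__builtin_frame_address(0);
}

/* Hypothetical helper: shift the stack by a masked offset, then call down. */
__attribute__((noinline)) static uintptr_t frame_address_after_offset(uint32_t offset)
{
	unsigned char *ptr = __builtin_alloca(KSTACK_OFFSET_MAX(offset));
	volatile uintptr_t frame;

	/* Empty asm with a memory output: the allocation cannot be dropped. */
	__asm__ volatile("" : "=m"(*ptr) : : "memory");

	frame = callee_frame_address();
	return frame;
}

/* Distance between the callee frame with no offset and with `offset`. */
static uintptr_t stack_shift_bytes(uint32_t offset)
{
	uintptr_t base = frame_address_after_offset(0);
	uintptr_t moved = frame_address_after_offset(offset);

	return base > moved ? base - moved : moved - base;
}

/* Loose bounds only: the exact shift depends on compiler alignment. */
static void check_stack_shift(void)
{
	assert(stack_shift_bytes(0x200) >= 0x100);
	assert(stack_shift_bytes(0x600) <= 0x300);	/* 0x600 is masked to 0x200 */
}
```

Calling check_stack_shift() from a test program shows that everything
called after the allocation runs at a shifted stack address, with no
assembly changes in the caller.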

In order to make this available by default with zero performance impact
for those that don't want it, it is boot-time selectable with static
branches. This way, if the overhead is not wanted, it can just be
left turned off with no performance impact.

The generated assembly for x86_64 with GCC looks like this:

...
ffffffff81003977: 65 8b 05 02 ea 00 7f  mov %gs:0x7f00ea02(%rip),%eax
					    # 12380 <kstack_offset>
ffffffff8100397e: 25 ff 03 00 00        and $0x3ff,%eax
ffffffff81003983: 48 83 c0 0f           add $0xf,%rax
ffffffff81003987: 25 f8 07 00 00        and $0x7f8,%eax
ffffffff8100398c: 48 29 c4              sub %rax,%rsp
ffffffff8100398f: 48 8d 44 24 0f        lea 0xf(%rsp),%rax
ffffffff81003994: 48 83 e0 f0           and $0xfffffffffffffff0,%rax
...

As a result of the above stack alignment, this patch introduces about
5 bits of randomness after pt_regs is spilled to the thread stack on
x86_64, and 6 bits on x86_32 (since it has 1 fewer bit required for
stack alignment). The amount of entropy could be adjusted based on how
much of the stack space we wish to trade for security.
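
The bit counts can be checked by counting how many distinct offsets survive
alignment. This is a rough sketch with hypothetical helper names; it assumes
8 bits of input (the low byte of rdtsc() mentioned above) and alignments of
8 bytes on x86_64 (matching the 0x7f8 mask in the disassembly) and 4 bytes
on x86_32 (an assumption derived from "1 fewer bit").

```c
#include <assert.h>
#include <stdint.h>

/* Count the distinct offsets left after rounding up to `align` bytes. */
static unsigned int distinct_offsets(unsigned int source_bits, unsigned int align)
{
	uint32_t limit;
	uint32_t last = UINT32_MAX;
	unsigned int count = 0;

	/* align must be a non-zero power of two */
	assert(source_bits < 32 && align != 0 && (align & (align - 1)) == 0);
	limit = 1u << source_bits;

	for (uint32_t raw = 0; raw < limit; raw++) {
		uint32_t rounded = (raw + align - 1) & ~(align - 1);

		/* rounded never decreases, so counting changes is enough */
		if (rounded != last) {
			count++;
			last = rounded;
		}
	}
	return count;
}

/* Whole bits of entropy represented by `value` equally likely outcomes. */
static unsigned int floor_log2(unsigned int value)
{
	unsigned int bits = 0;

	while (value >>= 1)
		bits++;
	return bits;
}
```

Under these assumptions, floor_log2(distinct_offsets(8, 8)) gives 5 and
floor_log2(distinct_offsets(8, 4)) gives 6, in line with the "about 5 bits"
and "6 bits" figures above.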

My measure of syscall performance overhead (on x86_64):

lmbench: /usr/lib/lmbench/bin/x86_64-linux-gnu/lat_syscall -N 10000 null
    randomize_kstack_offset=y	Simple syscall: 0.7082 microseconds
    randomize_kstack_offset=n	Simple syscall: 0.7016 microseconds

So, roughly 0.9% overhead growth for a no-op syscall, which is very
manageable. And for people that don't want this, it's off by default.
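
For reference, the overhead figure follows from the two lmbench latencies
above. A small sketch of the arithmetic (the function name is hypothetical):

```c
#include <assert.h>

/* Relative slowdown, in percent, of the enabled case over the disabled one. */
static double overhead_percent(double enabled_us, double disabled_us)
{
	assert(disabled_us > 0.0);
	return (enabled_us - disabled_us) / disabled_us * 100.0;
}
```

overhead_percent(0.7082, 0.7016) evaluates to roughly 0.94, which the text
rounds to 0.9%.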

There are two gotchas with using the alloca() trick. First,
compilers that have Stack Clash protection (-fstack-clash-protection)
enabled by default (e.g. Ubuntu[3]) add pagesize stack probes to
any dynamic stack allocations. While the randomization offset is
always less than a page, the resulting assembly would still contain
(unreachable!) probing routines, bloating the resulting assembly. To
avoid this, -fno-stack-clash-protection is unconditionally added to
the kernel Makefile since this is the only dynamic stack allocation in
the kernel (now that VLAs have been removed) and it is provably safe
from Stack Clash style attacks.

The second gotcha with alloca() is a negative interaction with
-fstack-protector*, in that it sees the alloca() as an array allocation,
which triggers the unconditional addition of the stack canary function
pre/post-amble which slows down syscalls regardless of the static
branch. In order to avoid adding this unneeded check and its associated
performance impact, architectures need to carefully remove uses of
-fstack-protector-strong (or -fstack-protector) in the compilation units
that use the add_random_kstack() macro and to audit the resulting stack
mitigation coverage (to make sure no desired coverage disappears). No
change is visible for this on x86 because the stack protector is already
unconditionally disabled for the compilation unit, but the change is
required on arm64. There is, unfortunately, no attribute that can be
used to disable stack protector for specific functions.

Comparison to PaX RANDKSTACK feature:

The RANDKSTACK feature randomizes the location of the stack start
(cpu_current_top_of_stack), i.e. including the location of pt_regs
structure itself on the stack. Initially this patch followed the same
approach, but during the recent discussions[2], it has been determined
to be of little value since, if ptrace functionality is available for
an attacker, they can use PTRACE_PEEKUSR/PTRACE_POKEUSR to read/write
different offsets in the pt_regs struct, observe the cache behavior of
the pt_regs accesses, and figure out the random stack offset. Another
difference is that the random offset is stored in a per-cpu variable,
rather than having it be per-thread. As a result, these implementations
differ a fair bit in their implementation details and results, though
obviously the intent is similar.

[1] https://lore.kernel.org/kernel-hardening/2236FBA76BA1254E88B949DDB74E612BA4BC57C1@IRSMSX102.ger.corp.intel.com/
[2] https://lore.kernel.org/kernel-hardening/20190329081358.30497-1-elena.reshetova@intel.com/
[3] https://lists.ubuntu.com/archives/ubuntu-devel/2019-June/040741.html

Co-developed-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210401232347.2791257-4-keescook@chromium.org

27 Feb, 2021 4 commits

lib: stackdepot: add support to disable stack depot · e1fdc403

Committed by Vijayanand Jitta on Feb 25, 2021

Add a kernel parameter, stack_depot_disable, to disable stack depot so
that the stack hash table doesn't consume any memory when stack depot is
disabled.

The use case is CONFIG_PAGE_OWNER without page_owner=on.  Without this
patch, stackdepot will consume the memory for the hashtable.  By default,
that is 8M, which is far from trivial.

With this option, on a system configured with CONFIG_PAGE_OWNER, passing
page_owner=off and stack_depot_disable on the kernel command line saves
the memory otherwise wasted on the hashtable.

[akpm@linux-foundation.org: fix CONFIG_STACKDEPOT=n build]

Link: https://lkml.kernel.org/r/1611749198-24316-2-git-send-email-vjitta@codeaurora.org
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Yogesh Lal <ylal@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

treewide: Miguel has moved · c131bd0b

Committed by Miguel Ojeda on Feb 25, 2021

Update contact info.

Link: https://lkml.kernel.org/r/20210206162524.GA11520@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Documentation: sysfs/memory: clarify some memory block device properties · a89107c0

Committed by David Hildenbrand on Feb 25, 2021

In commit 53cdc1cb ("drivers/base/memory.c: indicate all memory blocks
as removable") we changed the output of the "removable" property of memory
devices to return "1" if and only if the kernel supports memory offlining.

Let's update documentation, stating that the interface is legacy.  Also
update documentation of the "state" property and "valid_zones" properties.

Link: https://lkml.kernel.org/r/20210201181347.13262-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/base/memory: don't store phys_device in memory blocks · e9a2e48e

Committed by David Hildenbrand on Feb 25, 2021

No need to store the value for each and every memory block, as we can
easily query the value at runtime.  Reshuffle the members to optimize the
memory layout.  Also, let's clarify what the interface once was used for
and why it's legacy nowadays.

"phys_device" was used on s390x in older versions of lsmem[2]/chmem[3],
back when they were still part of s390-tools.  They were later replaced
by the variants in util-linux.  For example, RHEL6 and RHEL7 contain
lsmem/chmem from s390-utils.  RHEL8 switched to versions from util-linux
on s390x [4].

"phys_device" was added with sysfs support for memory hotplug in commit
3947be19 ("[PATCH] memory hotplug: sysfs and add/remove functions") in
2005.  It always returned 0.

s390x started returning something != 0 on some setups (if sclp.rzm is set
by HW) in 2010 via commit 57b552ba ("memory hotplug/s390: set
phys_device").

For s390x, it allowed for identifying which memory block devices belong to
the same storage increment (RZM).  Only if all memory block devices
comprising a single storage increment were offline, the memory could
actually be removed in the hypervisor.

Since commit e5d709bb ("s390/memory hotplug: provide
memory_block_size_bytes() function") in 2013 a memory block device spans
at least one storage increment - which is why the interface isn't really
helpful/used anymore (except by old lsmem/chmem tools).

There were once RFC patches to make use of "phys_device" in ACPI context;
however, the underlying problem could be solved using different interfaces
[1].

[1] https://patchwork.kernel.org/patch/2163871/
[2] https://github.com/ibm-s390-tools/s390-tools/blob/v2.1.0/zconf/lsmem
[3] https://github.com/ibm-s390-tools/s390-tools/blob/v2.1.0/zconf/chmem
[4] https://bugzilla.redhat.com/show_bug.cgi?id=1504134

Link: https://lkml.kernel.org/r/20210201181347.13262-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Vaibhav Jain <vaibhav@linux.ibm.com>
Cc: Tom Rix <trix@redhat.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

26 Feb, 2021 1 commit

Documentation: cgroup-v2: fix path to example BPF program · 43c4f657

Committed by Antonio Terceiro on Feb 24, 2021

This file has been moved into the "progs" subdirectory, together with all
test BPF programs.

Fixes: bd4aed0e ("selftests: bpf: centre kernel bpf objects under new subdir "progs"")
Signed-off-by: Antonio Terceiro <antonio.terceiro@linaro.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan.x@bytedance.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jiong Wang <jiong.wang@netronome.com>
Link: https://lore.kernel.org/r/20210224131631.349287-1-antonio.terceiro@linaro.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

25 Feb, 2021 4 commits

mm/vmscan: restore zone_reclaim_mode ABI · 51998364

Committed by Dave Hansen on Feb 24, 2021

I went to go add a new RECLAIM_* mode for the zone_reclaim_mode sysctl.
Like a good kernel developer, I also went to go update the
documentation.  I noticed that the bits in the documentation didn't
match the bits in the #defines.

The VM never explicitly checks the RECLAIM_ZONE bit.  The bit is,
however, implicitly checked when checking 'node_reclaim_mode==0'.  The
RECLAIM_ZONE #define was removed in a cleanup.  That, by itself, is fine.

But, when the bit was removed (bit 0) the _other_ bit locations also got
changed.  That's not OK because the bit values are documented to mean
one specific thing.  Users surely do not expect the meaning to change
from kernel to kernel.

The end result is that if someone had a script that did:

	sysctl vm.zone_reclaim_mode=1

it would have gone from enabling node reclaim for clean unmapped pages
to writing out pages during node reclaim after the commit in question.
That's not great.

Put the bits back the way they were and add a comment so something like
this is a bit harder to do again.  Update the documentation to make it
clear that the first bit is ignored.
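
The effect of the shifted bits can be illustrated with a small userspace
sketch. RECLAIM_ZONE is named above; RECLAIM_WRITE and RECLAIM_UNMAP, the
BROKEN_* layout and the helper functions are assumptions reconstructed from
the description, not copied from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Documented layout, as restored by this patch. */
#define RECLAIM_ZONE	(1u << 0)	/* never tested alone, only via mode != 0 */
#define RECLAIM_WRITE	(1u << 1)	/* write out pages during reclaim */
#define RECLAIM_UNMAP	(1u << 2)	/* unmap pages during reclaim */

/* Layout after bit 0 was dropped and the other bits moved down. */
#define BROKEN_RECLAIM_WRITE	(1u << 0)
#define BROKEN_RECLAIM_UNMAP	(1u << 1)

static bool node_reclaim_enabled(unsigned int mode)
{
	return mode != 0;
}

static bool writes_pages(unsigned int mode)
{
	return (mode & RECLAIM_WRITE) != 0;
}

static bool broken_writes_pages(unsigned int mode)
{
	return (mode & BROKEN_RECLAIM_WRITE) != 0;
}

/* vm.zone_reclaim_mode=1 under both layouts. */
static void check_mode_one(void)
{
	assert(node_reclaim_enabled(1));
	assert(!writes_pages(1));		/* documented: reclaim only */
	assert(broken_writes_pages(1));		/* shifted: also writes pages out */
}
```

With the documented layout, a mode of 1 only enables node reclaim; with the
shifted layout the same value also requests writeout, which is the ABI
change described above.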

Link: https://lkml.kernel.org/r/20210219172555.FF0CDF23@viggo.jf.intel.com
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Fixes: 648b5cf3 ("mm/vmscan: remove unused RECLAIM_OFF/RECLAIM_ZONE")
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Daniel Wagner <dwagner@suse.de>
Cc: "Tobin C. Harding" <tobin@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Qian Cai <cai@lca.pw>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: memcg: add swapcache stat for memcg v2 · b6038942

Committed by Shakeel Butt on Feb 24, 2021

This patch adds swapcache stat for the cgroup v2.  The swapcache
represents the memory that is accounted against both the memory and the
swap limit of the cgroup.  The main motivation behind exposing the
swapcache stat is for enabling users to gracefully migrate from cgroup
v1's memsw counter to cgroup v2's memory and swap counters.

Cgroup v1's memsw limit allows users to limit the memory+swap usage of a
workload but without control on the exact proportion of memory and swap.
Cgroup v2 provides separate limits for memory and swap which enables more
control on the exact usage of memory and swap individually for the
workload.

With some little subtleties, the v1's memsw limit can be switched with the
sum of the v2's memory and swap limits.  However the alternative for memsw
usage is not yet available in cgroup v2.  Exposing per-cgroup swapcache
stat enables that alternative.  Adding the memory usage and swap usage and
subtracting the swapcache will approximate the memsw usage.  This will
help in the transparent migration of the workloads depending on memsw
usage and limit to v2's memory and swap counters.
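
The approximation described in this paragraph can be sketched as follows.
The function name is hypothetical, and reading the three inputs from
memory.current, memory.swap.current and the "swapcache" entry of
memory.stat is an assumption about how a userspace tool would obtain them.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Approximate cgroup v1 memsw usage from cgroup v2 counters:
 * memory usage + swap usage - swapcache, since swapcache pages are
 * accounted against both the memory and the swap counter.
 */
static uint64_t approx_memsw_usage(uint64_t memory_bytes, uint64_t swap_bytes,
				   uint64_t swapcache_bytes)
{
	uint64_t total = memory_bytes + swap_bytes;

	/* The counters are sampled separately, so guard against underflow. */
	return total > swapcache_bytes ? total - swapcache_bytes : 0;
}

static void check_memsw_example(void)
{
	/* 100 bytes of memory, 50 of swap, 20 of them double-counted */
	assert(approx_memsw_usage(100, 50, 20) == 130);
}
```

The result is only an approximation because the three values are not read
atomically.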

The reasons these applications are still interested in this approximate
memsw usage are: (1) these applications are not really interested in two
separate memory and swap usage metrics.  A single usage metric is more
simple to use and reason about for them.

(2) The memsw usage metric hides the underlying system's swap setup from
the applications.  Applications with multiple instances running in a
datacenter with heterogeneous systems (some have swap and some don't) will
keep seeing a consistent view of their usage.

[akpm@linux-foundation.org: fix CONFIG_SWAP=n build]

Link: https://lkml.kernel.org/r/20210108155813.2914586-3-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm, slub: remove slub_memcg_sysfs boot param and CONFIG_SLUB_MEMCG_SYSFS_ON · fe2cce15

Committed by Vlastimil Babka on Feb 24, 2021

The boot param and config determine the value of memcg_sysfs_enabled,
which is unused since commit 10befea9 ("mm: memcg/slab: use a single
set of kmem_caches for all allocations") as there are no per-memcg kmem
caches anymore.

Link: https://lkml.kernel.org/r/20210127124745.7928-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

xfs: restore speculative_cow_prealloc_lifetime sysctl · 89e0eb8c

Committed by Darrick J. Wong on Feb 12, 2021

In commit 9669f51d I tried to get rid of the undocumented cow gc
lifetime knob.  The knob's function was never documented and it now
doesn't really have a function since eof and cow gc have been
consolidated.

Regrettably, xfs/231 relies on it and regresses on for-next.  I did not
succeed at getting far enough through fstests patch review for the fixup
to land in time.

Restore the sysctl knob, document what it did (does?), put it on the
deprecation schedule, and rip out a redundant function.

Fixes: 9669f51d ("xfs: consolidate the eofblocks and cowblocks workers")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

17 Feb, 2021 2 commits

preempt: Introduce CONFIG_PREEMPT_DYNAMIC · 6ef869e0

Committed by Michal Hocko on Jan 18, 2021

Preemption mode selection is currently hardcoded on Kconfig choices.
Introduce a dedicated option to tune preemption flavour at boot time.

This will only be available on architectures that efficiently support
static calls, so that the feature does not come with additional overhead
that might be prohibitive or undesirable.

CONFIG_PREEMPT_DYNAMIC is automatically selected by CONFIG_PREEMPT if
the architecture provides the necessary support (CONFIG_STATIC_CALL_INLINE,
CONFIG_GENERIC_ENTRY, and __preempt_schedule_function() /
__preempt_schedule_notrace_function()).

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
[peterz: relax requirement to HAVE_STATIC_CALL]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210118141223.123667-5-frederic@kernel.org

cifs: documentation cleanup · 731ddc09

Committed by Steve French on Feb 15, 2021

Various minor changes to the admin-guide for cifs

Signed-off-by: Steve French <stfrench@microsoft.com>

15 Feb, 2021 1 commit

lib/vsprintf: no_hash_pointers prints all addresses as unhashed · 5ead723a

Committed by Timur Tabi on Feb 14, 2021

If the no_hash_pointers command line parameter is set, then
printk("%p") will print pointers as unhashed, which is useful for
debugging purposes.  This change applies to any function that uses
vsprintf, such as print_hex_dump() and seq_buf_printf().

A large warning message is displayed if this option is enabled.
Unhashed pointers expose kernel addresses, which can be a security
risk.

Also update test_printf to skip the hashed pointer tests if the
command-line option is set.

Signed-off-by: Timur Tabi <timur@kernel.org>
Acked-by: Petr Mladek <pmladek@suse.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Marco Elver <elver@google.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20210214161348.369023-4-timur@kernel.org

12 Feb, 2021 2 commits

Documentation/admin-guide: kernel-parameters: Update nohlt section · 3cae85f5

Committed by Florian Fainelli on Feb 09, 2021

Update the documentation regarding "nohlt" and indicate that it is not
only for bugs, but can be useful to disable the architecture specific
sleep instructions. ARM, ARM64, SuperH and Microblaze all use
CONFIG_GENERIC_IDLE_POLL_SETUP which takes care of honoring the
"hlt"/"nohlt" parameters.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20210209172349.2249596-1-f.fainelli@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

doc/admin-guide: fix spelling mistake: "perfomance" -> "performance" · a15cb2c1

Committed by Colin Ian King on Feb 10, 2021

There is a spelling mistake in the perf-security documentation. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210210115624.53551-1-colin.king@canonical.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

10 Feb, 2021 1 commit

xfs: fix rst syntax error in admin guide · 8e8794b9

Committed by Darrick J. Wong on Feb 08, 2021

Tables are supposed to have a matching line of "===" to signal the end
of a table.  The rst compiler gets grouchy if it encounters EOF instead,
so fix this warning.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>

09 Feb, 2021 5 commits

x86/apb_timer: Remove driver for deprecated platform · 1b79fc4f

Committed by Andy Shevchenko on Jan 25, 2021

Intel Moorestown and Medfield are quite old Intel Atom based
32-bit platforms, which were in limited use in some Android phones,
tablets and consumer electronics more than eight years ago.

No bugs or problems caused by breaking any of those platforms have been
reported from outside Intel for years. It seems there are no real users
who run a more or less fresh kernel on them. Commit
05f4434b ("ASoC: Intel: remove mfld_machine") is also in line
with this theory.

Due to the above, and to reduce the burden of supporting outdated drivers,
remove the support for the outdated platforms completely.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

arm64: cpufeatures: Allow disabling of Pointer Auth from the command-line · f8da5752

Committed by Marc Zyngier on Feb 08, 2021

In order to be able to disable Pointer Authentication at runtime,
whether it is for testing purposes, or to work around HW issues,
let's add support for overriding the ID_AA64ISAR1_EL1.{GPI,GPA,API,APA}
fields.

This is further mapped on the arm64.nopauth command-line alias.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: David Brazdil <dbrazdil@google.com>
Tested-by: Srinivas Ramana <sramana@codeaurora.org>
Link: https://lore.kernel.org/r/20210208095732.3267263-23-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>

arm64: cpufeatures: Allow disabling of BTI from the command-line · 93ad55b7

Committed by Marc Zyngier on Feb 08, 2021

In order to be able to disable BTI at runtime, whether it is
for testing purposes, or to work around HW issues, let's add
support for overriding the ID_AA64PFR1_EL1.BTI field.

This is further mapped on the arm64.nobti command-line alias.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: David Brazdil <dbrazdil@google.com>
Tested-by: Srinivas Ramana <sramana@codeaurora.org>
Link: https://lore.kernel.org/r/20210208095732.3267263-21-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>

arm64: Make kvm-arm.mode={nvhe, protected} an alias of id_aa64mmfr1.vh=0 · 1945a067

Committed by Marc Zyngier on Feb 08, 2021

Admittedly, passing id_aa64mmfr1.vh=0 on the command-line isn't
that easy to understand, and it is likely that users would much
prefer to write "kvm-arm.mode=nvhe", or "...=protected".

So here you go. This has the added advantage that we can now
always honor the "kvm-arm.mode=protected" option, even when
booting on a VHE system.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: David Brazdil <dbrazdil@google.com>
Link: https://lore.kernel.org/r/20210208095732.3267263-18-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>

driver core: Add fw_devlink.strict kernel param · 19d0f5f6

Committed by Saravana Kannan on Feb 05, 2021

This param allows forcing all dependencies to be treated as mandatory.
This will be useful for boards in which all optional dependencies like
IOMMUs and DMAs need to be treated as mandatory dependencies.

Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210205222644.2357303-4-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

06 Feb, 2021 1 commit

entry: Use different define for selector variable in SUD · 36a6c843

Committed by Gabriel Krisman Bertazi on Feb 05, 2021

Michael Kerrisk suggested that, from an API perspective, it is a bad
idea to share the PR_SYS_DISPATCH_ defines between the prctl operation
and the selector variable.

Therefore, define two new constants to be used by SUD's selector variable
and update the corresponding documentation and test cases.

While this changes the API, syscall user dispatch has never been part of
a Linux release; it will show up for the first time in 5.11.

Suggested-by: Michael Kerrisk (man-pages) <mtk.manpages@gmail.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210205184321.2062251-1-krisman@collabora.com

05 Feb, 2021 1 commit

Documentation: admin-guide: Update kvm/xen config option · 61ffd285

Committed by André Almeida on Jan 29, 2021

Since commit 9bba03d4 ("kconfig: remove 'kvmconfig' and 'xenconfig'
shorthands") kvm/xen config shortcuts are not available anymore. Update
the file to reflect how they should be used, with the full filename.

Signed-off-by: André Almeida <andrealmeid@collabora.com>
Link: https://lore.kernel.org/r/20210130014547.123006-2-andrealmeid@collabora.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

04 Feb, 2021 3 commits

thunderbolt: Add support for PCIe tunneling disabled (SL5) · 3cd542e6

Committed by Mika Westerberg on Sep 03, 2020

Recent Intel Thunderbolt firmware connection manager has support for
another security level, SL5, that disables PCIe tunneling. This option
can be turned on from the BIOS.

When this is set the driver exposes a new security level "nopcie" to the
userspace and hides the authorized attribute under connected devices.

While there we also hide it when "dponly" security level is enabled
since it is not really usable in that case anyway.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Yehezkel Bernat <YehezkelShB@gmail.com>

xfs: expose the blockgc workqueue knobs publicly · 47bd6d34

Committed by Darrick J. Wong on Jan 25, 2021

Expose the workqueue sysfs knobs for the speculative preallocation gc
workers on all kernels, and update the sysadmin information.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs: increase the default parallelism levels of pwork clients · f83d436a

Committed by Darrick J. Wong on Jan 22, 2021

Increase the parallelism level for pwork clients to the workqueue
defaults so that we can take advantage of computers with a lot of CPUs
and a lot of hardware.  On fast systems this will speed up quotacheck by
a large factor, and the following posteof/cowblocks cleanup series will
use the functionality presented in this patch to run garbage collection
as quickly as possible.

We do this by switching the pwork workqueue to unbounded, since the
current user (quotacheck) runs lengthy scans for each work item and we
don't care about dispatching the work on a warm cpu cache or anything
like that.  Also set WQ_SYSFS so that we can monitor where the wq is
running.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>

03 Feb, 2021 2 commits

dm crypt: support using trusted keys · 363880c4

Committed by Ahmad Fatoum on Jan 22, 2021

Commit 27f5411a ("dm crypt: support using encrypted keys") extended
dm-crypt to allow use of "encrypted" keys along with "user" and "logon".

Along the same lines, teach dm-crypt to support "trusted" keys as well.

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm integrity: introduce the "fix_hmac" argument · 09d85f8d

Committed by Mikulas Patocka on Jan 21, 2021

The "fix_hmac" argument improves security of internal_hash and
journal_mac:
- the section number is mixed into the mac, so that an attacker can't
  copy sectors from one journal section to another journal section
- the superblock is protected by journal_mac
- a 16-byte salt stored in the superblock is mixed into the mac, so
  that the attacker can't detect that two disks have the same hmac
  key and also to prevent the attacker from moving sectors from one
  disk to another
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Daniel Glockner <dg@emlix.com>
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> # ReST fix
Tested-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

09d85f8d
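
The binding described in the "fix_hmac" commit above can be illustrated with a minimal userspace sketch. It is not dm-integrity's actual MAC construction or on-disk layout: the function name `sector_mac`, the byte order, the field order and the choice of SHA-256 are all assumptions made for illustration. It only shows why mixing a per-disk salt and a section number into the MAC input makes a copied sector fail verification elsewhere.

```python
import hashlib
import hmac


def sector_mac(key: bytes, salt: bytes, section: int, data: bytes) -> bytes:
    """Toy MAC (hypothetical layout): bind data to a disk salt and a section number."""
    message = salt + section.to_bytes(8, "little") + data
    return hmac.new(key, message, hashlib.sha256).digest()


key = b"k" * 32                 # same HMAC key on both disks
salt_a = bytes(range(16))       # 16-byte salt of disk A (fixed here for repeatability)
salt_b = bytes(range(16, 32))   # 16-byte salt of disk B
data = b"journal payload"

tag = sector_mac(key, salt_a, 3, data)

# The tag verifies only in its original place.
verifies_in_place = hmac.compare_digest(tag, sector_mac(key, salt_a, 3, data))
# Copying the sector to another journal section changes the expected MAC.
verifies_in_other_section = hmac.compare_digest(tag, sector_mac(key, salt_a, 4, data))
# Moving the sector to a disk with the same key but another salt also fails.
verifies_on_other_disk = hmac.compare_digest(tag, sector_mac(key, salt_b, 3, data))

print(verifies_in_place, verifies_in_other_section, verifies_on_other_disk)
```

Because the salt is part of the MAC input, two disks sharing one key also produce different tags for identical data, which is the property the commit message relies on to hide key reuse.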

02 Feb, 2021 2 commits

platform/x86: thinkpad_acpi: fixed warning and incorporated review comments · cfa75cca

Committed by Nitin Joshi on Feb 02, 2021

The previous commit, which added a new sysfs attribute for the keyboard
language, triggers a warning, and a few code corrections are needed to
address new review comments.

The following changes are made in this version:
 - corrected the warning. Many thanks to kernel test robot <lkp@intel.com>
   for reporting and determining this warning.
 - used the sysfs_emit_at() API instead of strcat.
 - sorted the keyboard language array.
 - removed unwanted spaces and corrected sentences.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Nitin Joshi <njoshi1@lenovo.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20210202003210.91773-1-njoshi1@lenovo.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>

cfa75cca

platform/x86: thinkpad_acpi: rectify length of title underline · a78b96fe

Committed by Lukas Bulwahn on Jan 29, 2021

Commit d7cbe277 ("platform/x86: thinkpad_acpi: set keyboard language")
adds information on keyboard setting to the thinkpad documentation, but
made the subsection title underline too short.

Hence, make htmldocs warns:

  Documentation/admin-guide/laptops/thinkpad-acpi.rst:1472: \
    WARNING: Title underline too short.

Rectify length of subsection title underline.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Link: https://lore.kernel.org/r/20210129040849.26740-1-lukas.bulwahn@gmail.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>

a78b96fe

29 Jan, 2021 6 commits

crypto: salsa20 - remove Salsa20 stream cipher algorithm · 663f63ee

Committed by Ard Biesheuvel on Jan 21, 2021

Salsa20 is not used anywhere in the kernel, is not suitable for disk
encryption, and is widely considered to have been superseded by ChaCha20.
So let's remove it.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

663f63ee

drivers: Remove CONFIG_OPROFILE support · f8408264

Committed by Viresh Kumar on Jan 14, 2021

The "oprofile" user-space tools don't use the kernel OPROFILE support
any more, and haven't in a long time. User-space has been converted to
the perf interfaces.

Remove kernel's old oprofile support.
Suggested-by: Christoph Hellwig <hch@infradead.org>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Robert Richter <rric@kernel.org>
Acked-by: Paul E. McKenney <paulmck@kernel.org> #RCU
Acked-by: William Cohen <wcohen@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Thomas Gleixner <tglx@linutronix.de>

f8408264

Update Documentation/admin-guide/sysctl/fs.rst · c66cb171

Committed by Eric Curtin on Jan 20, 2021

max_user_watches for epoll should say 1/25, rather than 1/32
Signed-off-by: Eric Curtin <ericcurtin17@gmail.com>
Link: https://lore.kernel.org/r/20210120132648.19046-1-ericcurtin17@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

c66cb171
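
To show what the 1/25 versus 1/32 correction above means in practice, here is a rough arithmetic sketch. The commit itself only states the fraction; the 4096-byte page size, the 1 GiB of low memory and the per-watch cost of about 160 bytes on a 64-bit kernel are assumptions for illustration, and `default_max_user_watches` is a hypothetical helper, not a kernel function.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def default_max_user_watches(lowmem_pages: int, watch_cost_bytes: int,
                             divisor: int = 25) -> int:
    """Estimate the per-user epoll watch limit from a fraction of low memory.

    The budget is 1/divisor of low memory (counted in pages), converted to
    bytes and divided by the assumed cost of one registered watch.
    """
    budget_bytes = (lowmem_pages // divisor) * PAGE_SIZE
    return budget_bytes // watch_cost_bytes


lowmem_pages = (1 << 30) // PAGE_SIZE   # 1 GiB of low memory (assumption)
watch_cost = 160                        # assumed bytes per watch, 64-bit kernel

limit_documented = default_max_user_watches(lowmem_pages, watch_cost, divisor=25)
limit_old_text = default_max_user_watches(lowmem_pages, watch_cost, divisor=32)
print(limit_documented, limit_old_text)
```

Under these assumptions the 1/25 fraction allows noticeably more watches than the 1/32 figure the documentation previously quoted, which is why the wording matters to anyone sizing the limit.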

Documentation/admin-guide: kernel-parameters: update CMA entries · bc47190d

Committed by Randy Dunlap on Jan 24, 2021

Add qualifying build option legend [CMA] to kernel boot options
that require CMA support to be enabled for them to be usable.

Also capitalize 'CMA' when it is used as an acronym.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Link: https://lore.kernel.org/r/20210125043202.22399-1-rdunlap@infradead.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

bc47190d

Documentation: kernel-parameters: add missing '<' · 187623b1

Committed by Wolfram Sang on Jan 27, 2021

Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20210127104343.5647-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

187623b1

perf/arm-cmn: Fix PMU instance naming · 79d7c3dc

Committed by Robin Murphy on Jan 28, 2021

Although it's neat to avoid the suffix for the typical case of a
single PMU, it means systems with multiple CMN instances end up with
inconsistent naming. I think it also breaks perf tool's "uncore alias"
logic if the common instance prefix is also the full name of one.

Avoid any surprises by not trying to be clever and simply numbering
every instance, even when it might technically prove redundant.

Fixes: 0ba64770 ("perf: Add Arm CMN-600 PMU driver")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/649a2281233f193d59240b13ed91b57337c77b32.1611839564.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>

79d7c3dc
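
The naming problem described in the arm-cmn commit above can be sketched as follows. The base name "arm_cmn", the exact shape of the old scheme and the prefix check are assumptions for illustration; the check is a stand-in for the kind of matching the commit message worries about, not the perf tool's real "uncore alias" logic.

```python
BASE = "arm_cmn"  # assumed base name of the PMU instances


def names_old(count: int) -> list[str]:
    """Assumed old scheme: first instance unsuffixed, later ones numbered."""
    return [BASE if i == 0 else f"{BASE}_{i}" for i in range(count)]


def names_new(count: int) -> list[str]:
    """Scheme after the fix: every instance is numbered, even a single one."""
    return [f"{BASE}_{i}" for i in range(count)]


def prefix_collisions(names: list[str]) -> list[str]:
    """Names that are both a full instance name and the prefix of another."""
    return [n for n in names
            if any(other != n and other.startswith(n + "_") for other in names)]


old = names_old(2)
new = names_new(2)
print(old, prefix_collisions(old))
print(new, prefix_collisions(new))
```

With two instances the old scheme yields one name that doubles as the common prefix, while numbering every instance leaves no such overlap.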

28 Jan, 2021 2 commits

media: rockchip: rkisp1: carry ip version information · fc672d80

Committed by Heiko Stuebner on Jan 21, 2021

The IP block evolved from its rk3288/rk3399 base and the vendor
designates them with a numerical version. rk3399 for example
is designated V10 probably meaning V1.0.

There doesn't seem to be an actual version register we could read that
information from, so allow the match_data to carry that information
for future differentiation.

Also carry that information in the hw_revision field of the media-
controller API, so that userspace also has access to that.

The added versions are:
- V10: at least rk3288 + rk3399
- V11: seemingly unused as of now, but probably appeared in some soc
- V12: at least rk3326 + px30
- V13: at least rk1808

[fix checkpatch warning don't use multiple blank lines]
Signed-off-by: Heiko Stuebner <heiko.stuebner@theobroma-systems.com>
Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>
Reviewed-by: Ezequiel Garcia <ezequiel@collabora.com>
Acked-by: Helen Koike <helen.koike@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>

fc672d80

perf/intel: Remove Perfmon-v4 counter_freezing support · 3daa96d6

Committed by Peter Zijlstra on Nov 10, 2020

Perfmon-v4 counter freezing is fundamentally broken; remove this default
disabled code to make sure nobody uses it.

The feature is called Freeze-on-PMI in the SDM, and if it would do that,
there wouldn't actually be a problem, *however* it does something subtly
different. It globally disables the whole PMU when it raises the PMI,
not when the PMI hits.

This means there's a window between the PMI getting raised and the PMI
actually getting served where we lose events and this violates the
perf counter independence. That is, a counting event should not result
in a different event count when there is a sampling event co-scheduled.

This is known to break existing software (RR).
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

3daa96d6

27 Jan, 2021 2 commits

speakup: Add documentation on changing the speakup messages language · cae2181b

Committed by Samuel Thibault on Jan 26, 2021

This documents how to use speakup_setlocale to set the speakup messages
language.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Link: https://lore.kernel.org/r/20210126222147.3848175-5-samuel.thibault@ens-lyon.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cae2181b

dmaengine: idxd: add module parameter to force disable of SVA · 03d939c7

Committed by Dave Jiang on Jan 22, 2021

Add a module parameter that overrides the SVA feature enabling. This keeps
the driver in legacy mode even when intel_iommu=sm_on is set. In this mode,
the descriptor fields must be programmed with dma_addr_t from the Linux DMA
API for source, destination, and completion descriptors.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161134110457.4005461.13171197785259115852.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

03d939c7
