1. 27 September 2020 (2 commits)
    • mm: replace memmap_context by meminit_context · c1d0da83
      By Laurent Dufour
      Patch series "mm: fix memory to node bad links in sysfs", v3.
      
      Sometimes, firmware may expose interleaved memory layout like this:
      
       Early memory node ranges
         node   1: [mem 0x0000000000000000-0x000000011fffffff]
         node   2: [mem 0x0000000120000000-0x000000014fffffff]
         node   1: [mem 0x0000000150000000-0x00000001ffffffff]
         node   0: [mem 0x0000000200000000-0x000000048fffffff]
         node   2: [mem 0x0000000490000000-0x00000007ffffffff]
      
      In that case, we can see memory blocks assigned to multiple nodes in
      sysfs:
      
        $ ls -l /sys/devices/system/memory/memory21
        total 0
        lrwxrwxrwx 1 root root     0 Aug 24 05:27 node1 -> ../../node/node1
        lrwxrwxrwx 1 root root     0 Aug 24 05:27 node2 -> ../../node/node2
        -rw-r--r-- 1 root root 65536 Aug 24 05:27 online
        -r--r--r-- 1 root root 65536 Aug 24 05:27 phys_device
        -r--r--r-- 1 root root 65536 Aug 24 05:27 phys_index
        drwxr-xr-x 2 root root     0 Aug 24 05:27 power
        -r--r--r-- 1 root root 65536 Aug 24 05:27 removable
        -rw-r--r-- 1 root root 65536 Aug 24 05:27 state
        lrwxrwxrwx 1 root root     0 Aug 24 05:25 subsystem -> ../../../../bus/memory
        -rw-r--r-- 1 root root 65536 Aug 24 05:25 uevent
        -r--r--r-- 1 root root 65536 Aug 24 05:27 valid_zones
      
      The same applies in the node directories: a memory21 link appears in
      both the node1 and node2 directories.
      
      This is wrong, but it doesn't prevent the system from running.
      However, when one of these memory blocks is later hot-unplugged and
      then hot-plugged, the system detects an inconsistency in the sysfs
      layout and a BUG_ON() is raised:
      
        kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084!
        LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
        Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4
        CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25
        Call Trace:
          add_memory_resource+0x23c/0x340 (unreliable)
          __add_memory+0x5c/0xf0
          dlpar_add_lmb+0x1b4/0x500
          dlpar_memory+0x1f8/0xb80
          handle_dlpar_errorlog+0xc0/0x190
          dlpar_store+0x198/0x4a0
          kobj_attr_store+0x30/0x50
          sysfs_kf_write+0x64/0x90
          kernfs_fop_write+0x1b0/0x290
          vfs_write+0xe8/0x290
          ksys_write+0xdc/0x130
          system_call_exception+0x160/0x270
          system_call_common+0xf0/0x27c
      
      This has been seen on a PowerPC LPAR.
      
      The root cause of this issue is that when a node's memory is
      registered, the range used can overlap another node's range, so the
      memory block ends up registered to multiple nodes in sysfs.
      
      There are two issues here:
      
       (a) The sysfs memory and node's layouts are broken due to these
           multiple links
      
       (b) The link errors in link_mem_sections() should not lead to a system
           panic.
      
      To address (a), register_mem_sect_under_node() should not rely on the
      system state to detect whether the link operation is triggered by a
      hot-plug operation or not.  This is addressed by patches 1 and 2 of
      this series.
      
      Issue (b) will be addressed separately.
      
      This patch (of 2):
      
      The memmap_context enum is used to detect whether a memory operation
      is due to a hot-add operation or is happening at boot time.

      Generalize it to cover hotplug operations and rename it to
      meminit_context.
      
      This patch introduces no functional change.
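
      As a rough sketch, the rename amounts to the following (the exact
      enumerator names are an assumption based on the upstream kernel,
      shown here only to illustrate the change):

        /* was: enum memmap_context { MEMMAP_EARLY, MEMMAP_HOTPLUG }; */
        enum meminit_context {
                MEMINIT_EARLY,          /* boot-time memory initialization */
                MEMINIT_HOTPLUG,        /* memory added via hotplug */
        };
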
      Suggested-by: David Hildenbrand <david@redhat.com>
      Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Nathan Lynch <nathanl@linux.ibm.com>
      Cc: Scott Cheloha <cheloha@linux.ibm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20200915094143.79181-1-ldufour@linux.ibm.com
      Link: https://lkml.kernel.org/r/20200915132624.9723-1-ldufour@linux.ibm.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/gup: fix gup_fast with dynamic page table folding · d3f7b1bb
      By Vasily Gorbik
      Currently, to make sure that every page table entry is read just once,
      gup_fast walks perform READ_ONCE and pass the pXd value down to the
      next gup_pXd_range function by value, e.g.:
      
        static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
                                 unsigned int flags, struct page **pages, int *nr)
        ...
                pudp = pud_offset(&p4d, addr);
      
      This function passes a reference to that local value copy to
      pXd_offset, and might get the very same pointer back in return.  This
      happens when the level is folded (on most arches), and that pointer
      should not be iterated.
      
      On s390, each task might use 5-, 4- or 3-level address translation,
      and hence have different levels folded, so the logic is more complex,
      and a non-iterable pointer to a local copy leads to severe problems.
      
      Here is an example of what happens with gup_fast on s390, for a task
      with 3-level paging, crossing a 2 GB pud boundary:
      
        // addr = 0x1007ffff000, end = 0x10080001000
        static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
                                 unsigned int flags, struct page **pages, int *nr)
        {
              unsigned long next;
              pud_t *pudp;
      
              // pud_offset returns &p4d itself (a pointer to a value on stack)
              pudp = pud_offset(&p4d, addr);
              do {
                // on the second iteration, this reads a "random" stack value
                      pud_t pud = READ_ONCE(*pudp);
      
                      // next = 0x10080000000, due to PUD_SIZE/MASK != PGDIR_SIZE/MASK on s390
                      next = pud_addr_end(addr, end);
                      ...
              } while (pudp++, addr = next, addr != end); // pudp++ iterating over stack
      
              return 1;
        }
      
      This happens since s390 moved to common gup code with commit
      d1874a0c ("s390/mm: make the pxd_offset functions more robust") and
      commit 1a42010c ("s390/mm: convert to the generic
      get_user_pages_fast code").
      
      s390 tried to mimic static level folding by changing the pXd_offset
      primitives to always calculate the top-level page table offset in
      pgd_offset, and to simply return the value passed in when pXd_offset
      has to act as folded.

      What is crucial for gup_fast, and what has been overlooked, is that
      PxD_SIZE/MASK, and thus pXd_addr_end, should also change
      correspondingly.  And the latter is not possible with dynamic folding.
      
      To fix the issue, pass the original pXdp pointers down to the
      gup_pXd_range functions in addition to the pXd values, and introduce
      pXd_offset_lockless helpers, which take an additional pXd entry value
      parameter.  This has already been discussed in
      
        https://lkml.kernel.org/r/20190418100218.0a4afd51@mschwideX1
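
      A minimal sketch of the lockless-helper idea (the generic fallback
      shown here is an assumption based on the upstream fix; on a folded
      level the helper simply returns a pointer to the caller's own copy):

        /* Use the entry value the caller already read with READ_ONCE
         * instead of dereferencing pgdp again, so a folded level can
         * safely hand back a pointer into the caller's stack frame. */
        #ifndef p4d_offset_lockless
        #define p4d_offset_lockless(pgdp, pgd, address) p4d_offset(&(pgd), address)
        #endif

        /* gup_p4d_range then receives both the pgdp pointer and the pgd value: */
        p4dp = p4d_offset_lockless(pgdp, pgd, addr);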
      
      Fixes: 1a42010c ("s390/mm: convert to the generic get_user_pages_fast code")
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
      Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
      Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: John Hubbard <jhubbard@nvidia.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: <stable@vger.kernel.org>	[5.2+]
      Link: https://lkml.kernel.org/r/patch.git-943f1e5dcff2.your-ad-here.call-01599856292-ext-8676@work.hours
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 21 September 2020 (1 commit)
  3. 20 September 2020 (3 commits)
  4. 19 September 2020 (2 commits)
  5. 18 September 2020 (2 commits)
    • mm: allow a controlled amount of unfairness in the page lock · 5ef64cc8
      By Linus Torvalds
      Commit 2a9127fc ("mm: rewrite wait_on_page_bit_common() logic") made
      the page locking entirely fair, in that if a waiter came in while the
      lock was held, the lock would be transferred to the lockers strictly in
      order.
      
      That was intended to finally get rid of the long-reported watchdog
      failures that involved the page lock under extreme load, where a process
      could end up waiting essentially forever, as other page lockers stole
      the lock from under it.
      
      It also improved some benchmarks, but it ended up causing huge
      performance regressions on others, simply because fair lock behavior
      doesn't give out the lock as aggressively: it yields better worst-case
      latency, but potentially much worse average latency and throughput.
      
      Instead of reverting that change entirely, this introduces a controlled
      amount of unfairness, with a sysctl knob to tune it if somebody needs
      to.  But the default value should hopefully be good for any normal load,
      allowing a few rounds of lock stealing, but enforcing the strict
      ordering before the lock has been stolen too many times.
      
      There is also a hint from Matthieu Baerts that the fair page coloring
      may end up exposing an ABBA deadlock that is hidden by the usual
      optimistic lock stealing, and while the unfairness doesn't fix the
      fundamental issue (and I'm still looking at that), it avoids it in
      practice.
      
      The amount of unfairness can be modified by writing a new value to the
      'sysctl_page_lock_unfairness' variable (default value of 5, exposed
      through /proc/sys/vm/page_lock_unfairness), but that is hopefully
      something we'd use mainly for debugging rather than being necessary for
      any deep system tuning.
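
      As a quick illustration, the knob can be inspected from userspace
      like this (a minimal sketch; the path and the default of 5 come from
      the description above):

        #include <stdio.h>

        int main(void)
        {
                /* Read the current page-lock unfairness setting. */
                FILE *f = fopen("/proc/sys/vm/page_lock_unfairness", "r");
                int val;

                if (f && fscanf(f, "%d", &val) == 1)
                        printf("page_lock_unfairness = %d\n", val); /* default: 5 */
                if (f)
                        fclose(f);
                return 0;
        }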
      
      This whole issue has exposed just how critical the page lock can be,
      and how contended it gets under certain loads.  And the main
      contention doesn't really seem to be anything related to IO (which was
      the origin of this lock), but about things like just verifying that
      the page's file mapping is stable while faulting the page into a page
      table.
      
      Link: https://lore.kernel.org/linux-fsdevel/ed8442fd-6f54-dd84-cd4a-941e8b7ee603@MichaelLarabel.com/
      Link: https://www.phoronix.com/scan.php?page=article&item=linux-50-59&num=1
      Link: https://lore.kernel.org/linux-fsdevel/c560a38d-8313-51fb-b1ec-e904bd8836bc@tessares.net/
      Reported-and-tested-by: Michael Larabel <Michael@michaellarabel.com>
      Tested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • arm64: paravirt: Initialize steal time when cpu is online · 75df529b
      By Andrew Jones
      Steal time initialization requires mapping a memory region which
      invokes a memory allocation. Doing this at CPU starting time results
      in the following trace when CONFIG_DEBUG_ATOMIC_SLEEP is enabled:
      
      BUG: sleeping function called from invalid context at mm/slab.h:498
      in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/1
      CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.9.0-rc5+ #1
      Call trace:
       dump_backtrace+0x0/0x208
       show_stack+0x1c/0x28
       dump_stack+0xc4/0x11c
       ___might_sleep+0xf8/0x130
       __might_sleep+0x58/0x90
       slab_pre_alloc_hook.constprop.101+0xd0/0x118
       kmem_cache_alloc_node_trace+0x84/0x270
       __get_vm_area_node+0x88/0x210
       get_vm_area_caller+0x38/0x40
       __ioremap_caller+0x70/0xf8
       ioremap_cache+0x78/0xb0
       memremap+0x9c/0x1a8
       init_stolen_time_cpu+0x54/0xf0
       cpuhp_invoke_callback+0xa8/0x720
       notify_cpu_starting+0xc8/0xd8
       secondary_start_kernel+0x114/0x180
      CPU1: Booted secondary processor 0x0000000001 [0x431f0a11]
      
      However, we don't need to initialize steal time at CPU starting time.
      We can simply wait until CPU online time, sacrificing just a bit of
      accuracy by returning zero for steal time until we know better.
      
      While at it, add __init to the functions that are only called by
      pv_time_init(), which is itself __init.
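
      A rough sketch of the approach (the state and function names here are
      illustrative assumptions, not necessarily the exact ones in the
      patch): register the setup at the CPU online stage, where sleeping
      allocations are allowed, instead of the atomic CPU starting stage.

        /* Runs in a preemptible context once the CPU is fully online, so
         * the memremap() call chain (which may sleep) is safe here. */
        static int stolen_time_cpu_online(unsigned int cpu)
        {
                return init_stolen_time_cpu(cpu); /* maps the stolen-time region */
        }

        /* CPUHP_AP_ONLINE_DYN callbacks run on the CPU after it is online,
         * unlike the *_STARTING states, which run with IRQs disabled. */
        cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "hypervisor/arm/pvtime:online",
                          stolen_time_cpu_online, stolen_time_cpu_down_prepare);
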
      Signed-off-by: Andrew Jones <drjones@redhat.com>
      Fixes: e0685fa2 ("arm64: Retrieve stolen time as paravirtualized guest")
      Cc: stable@vger.kernel.org
      Reviewed-by: Steven Price <steven.price@arm.com>
      Link: https://lore.kernel.org/r/20200916154530.40809-1-drjones@redhat.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  6. 17 September 2020 (2 commits)
  7. 16 September 2020 (2 commits)
  8. 11 September 2020 (4 commits)
  9. 10 September 2020 (2 commits)
  10. 09 September 2020 (1 commit)
  11. 08 September 2020 (2 commits)
  12. 07 September 2020 (1 commit)
  13. 06 September 2020 (1 commit)
  14. 05 September 2020 (2 commits)
  15. 04 September 2020 (3 commits)
  16. 03 September 2020 (2 commits)
  17. 01 September 2020 (1 commit)
    • HID: core: Sanitize event code and type when mapping input · 35556bed
      By Marc Zyngier
      When calling into hid_map_usage(), the passed event code is
      blindly stored as is, even if it doesn't fit in the associated bitmap.
      
      This event code can come from a variety of sources, including devices
      masquerading as input devices, only a bit more "programmable".
      
      Instead of taking the event code at face value, check that it actually
      fits the corresponding bitmap, and if it doesn't:
      - spit out a warning so that we know which device is acting up
      - NULLify the bitmap pointer so that we catch unexpected uses
      
      Code paths that can make use of untrusted inputs can now check
      that the mapping was indeed correct and bail out if not.
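
      A minimal sketch of the check described above (simplified; only two
      event types are shown, and the variable names are assumptions for
      illustration):

        unsigned long *bmap = NULL;
        unsigned int limit = 0;

        switch (type) {
        case EV_KEY: bmap = input->keybit; limit = KEY_MAX; break;
        case EV_ABS: bmap = input->absbit; limit = ABS_MAX; break;
        /* ... other event types ... */
        }

        /* Reject codes that don't fit the bitmap for their type: warn so
         * the offending device is visible, and NULL the bitmap pointer so
         * later users of the mapping can detect the failure and bail out. */
        if (c > limit || bmap == NULL) {
                pr_warn_ratelimited("%s: Invalid code %d type %d\n",
                                    input->name, c, type);
                *bit = NULL;
                return;
        }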
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Signed-off-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
  18. 29 August 2020 (3 commits)
  19. 27 August 2020 (4 commits)