提交 · c1ff3f95497e64b67230bddc1242f2d228880859 · openeuler / Kernel

14 10月, 2020 4 次提交

export.h: fix section name for CONFIG_TRIM_UNUSED_KSYMS for Clang · 4d6fb34a

由 Nick Desaulniers 提交于 10月 13, 2020

When enabling CONFIG_TRIM_UNUSED_KSYMS, the linker will warn about the
orphan sections:

(".discard.ksym") is being placed in '".discard.ksym"'

repeatedly when linking vmlinux.  This is because the stringification
operator, `#`, in the preprocessor escapes strings.  GCC and Clang differ
in how they treat section names that contain \".

The portable solution is to not use a string literal with the preprocessor
stringification operator.

Fixes: commit bbda5ec6 ("kbuild: simplify dependency generation for CONFIG_TRIM_UNUSED_KSYMS")
Reported-by: Nkbuild test robot <lkp@intel.com>
Suggested-by: NKees Cook <keescook@chromium.org>
Signed-off-by: NNick Desaulniers <ndesaulniers@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Reviewed-by: NKees Cook <keescook@chromium.org>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Matthias Maennich <maennich@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Will Deacon <will@kernel.org>
Link: https://bugs.llvm.org/show_bug.cgi?id=42950
Link: https://github.com/ClangBuiltLinux/linux/issues/1166
Link: https://lkml.kernel.org/r/20200929190701.398762-1-ndesaulniers@google.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4d6fb34a

compiler.h: avoid escaped section names · a25c13b3

由 Nick Desaulniers 提交于 10月 13, 2020

The stringification operator, `#`, in the preprocessor escapes strings.
For example, `# "foo"` becomes `"\"foo\""`.  GCC and Clang differ in how
they treat section names that contain \".

The portable solution is to not use a string literal with the preprocessor
stringification operator.

In this case, since __section unconditionally uses the stringification
operator, we actually want the more verbose
__attribute__((__section__())).

Fixes: commit e04462fb ("Compiler Attributes: remove uses of __attribute__ from compiler.h")
Signed-off-by: NNick Desaulniers <ndesaulniers@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Arvind Sankar <nivedita@alum.mit.edu>
Link: https://bugs.llvm.org/show_bug.cgi?id=42950
Link: https://lkml.kernel.org/r/20200929194318.548707-1-ndesaulniers@google.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a25c13b3

compiler-gcc: improve version error · c8db3b0a

由 Nick Desaulniers 提交于 10月 13, 2020

As Kees suggests, doing so provides developers with two useful pieces of
information:
- The kernel build was attempting to use GCC.
  (Maybe they accidentally poked the wrong configs in a CI.)
- They need 4.9 or better.
  ("Upgrade to what version?" doesn't need to be dug out of documentation,
   headers, etc.)
Suggested-by: NKees Cook <keescook@chromium.org>
Signed-off-by: NNick Desaulniers <ndesaulniers@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Tested-by: NSedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: NKees Cook <keescook@chromium.org>
Reviewed-by: NMiguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Reviewed-by: NNathan Chancellor <natechancellor@gmail.com>
Reviewed-by: NSedat Dilek <sedat.dilek@gmail.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/20200902225911.209899-8-ndesaulniers@google.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c8db3b0a

compiler-clang: add build check for clang 10.0.1 · 1f7a44f6

由 Nick Desaulniers 提交于 10月 13, 2020

Patch series "set clang minimum version to 10.0.1", v3.

Adds a compile time #error to compiler-clang.h setting the effective
minimum supported version to clang 10.0.1.  A separate patch has already
been picked up into the Documentation/ tree also confirming the version.

Next are a series of reverts. One for 32b arm is a partial revert.

Then Marco suggested fixes to KASAN docs.

Finally, improve the warning for GCC too as per Kees.

This patch (of 7):

During Plumbers 2020, we voted to just support the latest release of Clang
for now.  Add a compile time check for this.

We plan to remove workarounds for older versions now, which will break in
subtle and not so subtle ways.
Suggested-by: NSedat Dilek <sedat.dilek@gmail.com>
Suggested-by: NNathan Chancellor <natechancellor@gmail.com>
Suggested-by: NKees Cook <keescook@chromium.org>
Signed-off-by: NNick Desaulniers <ndesaulniers@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Tested-by: NSedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: NKees Cook <keescook@chromium.org>
Reviewed-by: NMiguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Reviewed-by: NSedat Dilek <sedat.dilek@gmail.com>
Acked-by: NMarco Elver <elver@google.com>
Acked-by: NNathan Chancellor <natechancellor@gmail.com>
Acked-by: NSedat Dilek <sedat.dilek@gmail.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lkml.kernel.org/r/20200902225911.209899-1-ndesaulniers@google.com
Link: https://lkml.kernel.org/r/20200902225911.209899-2-ndesaulniers@google.com
Link: https://github.com/ClangBuiltLinux/linux/issues/9
Link: https://github.com/ClangBuiltLinux/linux/issues/941Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1f7a44f6

12 10月, 2020 2 次提交

mm: khugepaged: recalculate min_free_kbytes after memory hotplug as expected by khugepaged · 4aab2be0

由 Vijay Balakrishna 提交于 10月 10, 2020

When memory is hotplug added or removed the min_free_kbytes should be
recalculated based on what is expected by khugepaged.  Currently after
hotplug, min_free_kbytes will be set to a lower default and higher
default set when THP enabled is lost.

This change restores min_free_kbytes as expected for THP consumers.

[vijayb@linux.microsoft.com: v5]
  Link: https://lkml.kernel.org/r/1601398153-5517-1-git-send-email-vijayb@linux.microsoft.com

Fixes: f000565a ("thp: set recommended min free kbytes")
Signed-off-by: NVijay Balakrishna <vijayb@linux.microsoft.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Reviewed-by: NPavel Tatashin <pasha.tatashin@soleen.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Cc: Allen Pais <apais@microsoft.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/1600305709-2319-2-git-send-email-vijayb@linux.microsoft.com
Link: https://lkml.kernel.org/r/1600204258-13683-1-git-send-email-vijayb@linux.microsoft.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4aab2be0

mm: validate inode in mapping_set_error() · 8b7b2eb1

由 Minchan Kim 提交于 10月 10, 2020

The swap address_space doesn't have host. Thus, it makes kernel crash once
swap write meets error. Fix it.

Fixes: 735e4ae5 ("vfs: track per-sb writeback errors and report them to syncfs")
Signed-off-by: NMinchan Kim <minchan@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Acked-by: NJeff Layton <jlayton@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Andres Freund <andres@anarazel.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: David Howells <dhowells@redhat.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20201010000650.750063-1-minchan@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8b7b2eb1

10 10月, 2020 1 次提交

genirq/irqdomain: Allow partial trimming of irq_data hierarchy · 55567976

由 Marc Zyngier 提交于 10月 06, 2020

It appears that some HW is ugly enough that not all the interrupts
connected to a particular interrupt controller end up with the same
hierarchy depth (some of them are terminated early). This leaves
the irqchip hacker with only two choices, both equally bad:

- create discrete domain chains, one for each "hierarchy depth",
  which is very hard to maintain

- create fake hierarchy levels for the shallow paths, leading
  to all kind of problems (what are the safe hwirq values for these
  fake levels?)

Implement the ability to cut short a single interrupt hierarchy
from a level marked as being disconnected by using the new
irq_domain_disconnect_hierarchy() helper.

The irqdomain allocation code will then perform the trimming
Signed-off-by: NMarc Zyngier <maz@kernel.org>

55567976

09 10月, 2020 3 次提交

lockdep: Revert "lockdep: Use raw_cpu_*() for per-cpu variables" · baffd723

由 Peter Zijlstra 提交于 10月 05, 2020

The thinking in commit:

  fddf9055 ("lockdep: Use raw_cpu_*() for per-cpu variables")

is flawed. While it is true that when we're migratable both CPUs will
have a 0 value, it doesn't hold that when we do get migrated in the
middle of a raw_cpu_op(), the old CPU will still have 0 by the time we
get around to reading it on the new CPU.

Luckily, the reason for that commit (s390 using preempt_disable()
instead of preempt_disable_notrace() in their percpu code), has since
been fixed by commit:

  1196f12a ("s390: don't trace preemption in percpu macros")

An audit of arch/*/include/asm/percpu*.h shows there are no other
architectures affected by this particular issue.

Fixes: fddf9055 ("lockdep: Use raw_cpu_*() for per-cpu variables")
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NIngo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20201005095958.GJ2651@hirez.programming.kicks-ass.net

baffd723

lockdep: Fix lockdep recursion · 4d004099

由 Peter Zijlstra 提交于 10月 02, 2020

Steve reported that lockdep_assert*irq*(), when nested inside lockdep
itself, will trigger a false-positive.

One example is the stack-trace code, as called from inside lockdep,
triggering tracing, which in turn calls RCU, which then uses
lockdep_assert_irqs_disabled().

Fixes: a21ee605 ("lockdep: Change hardirq{s_enabled,_context} to per-cpu variables")
Reported-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

4d004099

lockdep: Fix usage_traceoverflow · 2bb8945b

由 Peter Zijlstra 提交于 9月 30, 2020

Basically print_lock_class_header()'s for loop is out of sync with the
the size of of ->usage_traces[].

Also clean things up a bit while at it, to avoid such mishaps in the future.

Fixes: 23870f12 ("locking/lockdep: Fix "USED" <- "IN-NMI" inversions")
Reported-by: NQian Cai <cai@redhat.com>
Debugged-by: NBoqun Feng <boqun.feng@gmail.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NIngo Molnar <mingo@kernel.org>
Tested-by: NQian Cai <cai@redhat.com>
Link: https://lkml.kernel.org/r/20200930094937.GE2651@hirez.programming.kicks-ass.net

2bb8945b

08 10月, 2020 1 次提交

locking/seqlock: Tweak DEFINE_SEQLOCK() kernel doc · 24a18772

由 Sebastian Andrzej Siewior 提交于 9月 24, 2020

ctags creates a warning:
|ctags: Warning: include/linux/seqlock.h:738: null expansion of name pattern "\2"

The DEFINE_SEQLOCK() macro is passed to ctags and being told to expect
an argument.

Add a dummy argument to keep ctags quiet.
Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: NWill Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/20200924154851.skmswuyj322yuz4g@linutronix.de

24a18772

07 10月, 2020 4 次提交

block: soft limit zone-append sectors as well · fe6f0cdc

由 Johannes Thumshirn 提交于 10月 07, 2020

Martin rightfully noted that for normal filesystem IO we have soft limits
in place, to prevent them from getting too big and not lead to
unpredictable latencies. For zone append we only have the hardware limit
in place.

Cap the max sectors we submit via zone-append to the maximal number of
sectors if the second limit is lower.
Reported-by: NMartin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/linux-btrfs/yq1k0w8g3rw.fsf@ca-mkp.ca.oracle.comSigned-off-by: NJens Axboe <axboe@kernel.dk>

fe6f0cdc

fs: remove no longer used dio_end_io() · c33fe275

由 Goldwyn Rodrigues 提交于 9月 24, 2020

Since we removed the last user of dio_end_io() when btrfs got converted
to iomap infrastructure ("btrfs: switch to iomap for direct IO"), remove
the helper function dio_end_io().
Reviewed-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
Signed-off-by: NGoldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

c33fe275

x86/mce: Recover from poison found while copying from user space · c0ab7ffc

由 Tony Luck 提交于 10月 06, 2020

Existing kernel code can only recover from a machine check on code that
is tagged in the exception table with a fault handling recovery path.

Add two new fields in the task structure to pass information from
machine check handler to the "task_work" that is queued to run before
the task returns to user mode:

+ mce_vaddr: will be initialized to the user virtual address of the fault
  in the case where the fault occurred in the kernel copying data from
  a user address.  This is so that kill_me_maybe() can provide that
  information to the user SIGBUS handler.

+ mce_kflags: copy of the struct mce.kflags needed by kill_me_maybe()
  to determine if mce_vaddr is applicable to this error.

Add code to recover from a machine check while copying data from user
space to the kernel. Action for this case is the same as if the user
touched the poison directly; unmap the page and send a SIGBUS to the task.

Use a new helper function to share common code between the "fault
in user mode" case and the "fault while copying from user" case.

New code paths will be activated by the next patch which sets
MCE_IN_KERNEL_COPYIN.
Suggested-by: NBorislav Petkov <bp@alien8.de>
Signed-off-by: NTony Luck <tony.luck@intel.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20201006210910.21062-6-tony.luck@intel.com

c0ab7ffc

block: optimize blk_queue_zoned_model for !CONFIG_BLK_DEV_ZONED · 6fcd6695

由 Christoph Hellwig 提交于 8月 20, 2020

Always return BLK_ZONED_NONE if zoned device support is not enabled.
This allows various compiler optimizations including the dead code
elimination that we so like for avoiding ifdefs.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NKeith Busch <kbusch@kernel.org>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com>

6fcd6695

06 10月, 2020 9 次提交

block: remove the unused blk_integrity_merge_bio export · d59da419

由 Christoph Hellwig 提交于 10月 06, 2020

Also move the definition from the public blkdev.h to the private
block/blk.h header.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

d59da419

block: remove the unused blk_integrity_merge_rq export · 92cf2fd1

由 Christoph Hellwig 提交于 10月 06, 2020

Also move the definition from the public blkdev.h to the private
block/blk.h header.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

92cf2fd1

block: move 'q_usage_counter' into front of 'request_queue' · 0549e87c

由 Ming Lei 提交于 10月 01, 2020

The field of 'q_usage_counter' is always fetched in fast path of every
block driver, and move it into front of 'request_queue', so it can be
fetched into 1st cacheline of 'request_queue' instance.
Signed-off-by: NMing Lei <ming.lei@redhat.com>
Tested-by: NVeronika Kabatova <vkabatov@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

0549e87c

percpu_ref: reduce memory footprint of percpu_ref in fast path · 2b0d3d3e

由 Ming Lei 提交于 10月 01, 2020

'struct percpu_ref' is often embedded into one user structure, and the
instance is usually referenced in fast path, however actually only
'percpu_count_ptr' is needed in fast path.

So move other fields into one new structure of 'percpu_ref_data', and
allocate it dynamically via kzalloc(), then memory footprint of
'percpu_ref' in fast path is reduced a lot and becomes suitable to put
into hot cacheline of user structure.
Signed-off-by: NMing Lei <ming.lei@redhat.com>
Tested-by: NVeronika Kabatova <vkabatov@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NTejun Heo <tj@kernel.org>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

2b0d3d3e

genirq/PM: Introduce IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND flag · 90428a8e

由 Maulik Shah 提交于 9月 28, 2020

An interrupt that is disabled/masked but set for wakeup may still need to
be able to wake up the system from sleep states like "suspend to RAM".

To that effect, introduce the IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND flag.
If the irqchip have this flag set, the irq PM code will enable/unmask
the irqs that are marked for wakeup, but that are in a disabled state.

On resume, such irqs will be restored back to their disabled state.
Suggested-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NMaulik Shah <mkshah@codeaurora.org>
[maz: commit message fix-up]
Signed-off-by: NMarc Zyngier <maz@kernel.org>
Tested-by: NStephen Boyd <swboyd@chromium.org>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NDouglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/1601267524-20199-4-git-send-email-mkshah@codeaurora.org

90428a8e

x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}() · ec6347bb

由 Dan Williams 提交于 10月 05, 2020

In reaction to a proposal to introduce a memcpy_mcsafe_fast()
implementation Linus points out that memcpy_mcsafe() is poorly named
relative to communicating the scope of the interface. Specifically what
addresses are valid to pass as source, destination, and what faults /
exceptions are handled.

Of particular concern is that even though x86 might be able to handle
the semantics of copy_mc_to_user() with its common copy_user_generic()
implementation other archs likely need / want an explicit path for this
case:

  On Fri, May 1, 2020 at 11:28 AM Linus Torvalds <torvalds@linux-foundation.org> wrote:
  >
  > On Thu, Apr 30, 2020 at 6:21 PM Dan Williams <dan.j.williams@intel.com> wrote:
  > >
  > > However now I see that copy_user_generic() works for the wrong reason.
  > > It works because the exception on the source address due to poison
  > > looks no different than a write fault on the user address to the
  > > caller, it's still just a short copy. So it makes copy_to_user() work
  > > for the wrong reason relative to the name.
  >
  > Right.
  >
  > And it won't work that way on other architectures. On x86, we have a
  > generic function that can take faults on either side, and we use it
  > for both cases (and for the "in_user" case too), but that's an
  > artifact of the architecture oddity.
  >
  > In fact, it's probably wrong even on x86 - because it can hide bugs -
  > but writing those things is painful enough that everybody prefers
  > having just one function.

Replace a single top-level memcpy_mcsafe() with either
copy_mc_to_user(), or copy_mc_to_kernel().

Introduce an x86 copy_mc_fragile() name as the rename for the
low-level x86 implementation formerly named memcpy_mcsafe(). It is used
as the slow / careful backend that is supplanted by a fast
copy_mc_generic() in a follow-on patch.

One side-effect of this reorganization is that separating copy_mc_64.S
to its own file means that perf no longer needs to track dependencies
for its memcpy_64.S benchmarks.

 [ bp: Massage a bit. ]
Signed-off-by: NDan Williams <dan.j.williams@intel.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Reviewed-by: NTony Luck <tony.luck@intel.com>
Acked-by: NMichael Ellerman <mpe@ellerman.id.au>
Cc: <stable@vger.kernel.org>
Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com
Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com

ec6347bb

regmap: irq: Add support to clear ack registers · 3a6f0fb7

由 Laxminath Kasam 提交于 10月 05, 2020

For particular codec HWs have requirement to toggle interrupt clear
register twice 0->1->0. To accommodate it, need to add one more field
(clear_ack) in the regmap_irq struct and update regmap-irq driver to
support it.
Signed-off-by: NLaxminath Kasam <lkasam@codeaurora.org>
Link: https://lore.kernel.org/r/1601907440-13373-1-git-send-email-lkasam@codeaurora.orgSigned-off-by: NMark Brown <broonie@kernel.org>

3a6f0fb7

block: make bio_crypt_clone() able to fail · 07560151

由 Eric Biggers 提交于 9月 15, 2020

bio_crypt_clone() assumes its gfp_mask argument always includes
__GFP_DIRECT_RECLAIM, so that the mempool_alloc() will always succeed.

However, bio_crypt_clone() might be called with GFP_ATOMIC via
setup_clone() in drivers/md/dm-rq.c, or with GFP_NOWAIT via
kcryptd_io_read() in drivers/md/dm-crypt.c.

Neither case is currently reachable with a bio that actually has an
encryption context.  However, it's fragile to rely on this.  Just make
bio_crypt_clone() able to fail, analogous to bio_integrity_clone().
Reported-by: NMiaohe Lin <linmiaohe@huawei.com>
Signed-off-by: NEric Biggers <ebiggers@google.com>
Reviewed-by: NMike Snitzer <snitzer@redhat.com>
Reviewed-by: NSatya Tangirala <satyat@google.com>
Cc: Satya Tangirala <satyat@google.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

07560151

block: add a bdget_part helper · 10ed1666

由 Christoph Hellwig 提交于 9月 25, 2020

All remaining callers of bdget() outside of fs/block_dev.c want to get a
reference to the struct block_device for a given struct hd_struct.  Add
a helper just for that and then mark bdget static.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

10ed1666

03 10月, 2020 13 次提交

sched/debug: Add new tracepoint to track cpu_capacity · 51cf18c9

由 Vincent Donnefort 提交于 8月 28, 2020

rq->cpu_capacity is a key element in several scheduler parts, such as EAS
task placement and load balancing. Tracking this value enables testing
and/or debugging by a toolkit.
Signed-off-by: NVincent Donnefort <vincent.donnefort@arm.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1598605249-72651-1-git-send-email-vincent.donnefort@arm.com

51cf18c9

mm: remove compat_process_vm_{readv,writev} · c3973b40

由 Christoph Hellwig 提交于 9月 25, 2020

Now that import_iovec handles compat iovecs, the native syscalls
can be used for the compat case as well.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

c3973b40

fs: remove compat_sys_vmsplice · 598b3cec

由 Christoph Hellwig 提交于 9月 25, 2020

Now that import_iovec handles compat iovecs, the native vmsplice syscall
can be used for the compat case as well.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

598b3cec

fs: remove the compat readv/writev syscalls · 5f764d62

由 Christoph Hellwig 提交于 9月 25, 2020

Now that import_iovec handles compat iovecs, the native readv and writev
syscalls can be used for the compat case as well.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

5f764d62

fs: remove various compat readv/writev helpers · 3523a9d4

由 Christoph Hellwig 提交于 9月 25, 2020

Now that import_iovec handles compat iovecs as well, all the duplicated
code in the compat readv/writev helpers is not needed.  Remove them
and switch the compat syscall handlers to use the native helpers.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

3523a9d4

iov_iter: transparently handle compat iovecs in import_iovec · 89cd35c5

由 Christoph Hellwig 提交于 9月 25, 2020

Use in compat_syscall to import either native or the compat iovecs, and
remove the now superflous compat_import_iovec.

This removes the need for special compat logic in most callers, and
the remaining ones can still be simplified by using __import_iovec
with a bool compat parameter.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

89cd35c5

iov_iter: refactor rw_copy_check_uvector and import_iovec · bfdc5970

由 Christoph Hellwig 提交于 9月 25, 2020

Split rw_copy_check_uvector into two new helpers with more sensible
calling conventions:

 - iovec_from_user copies a iovec from userspace either into the provided
   stack buffer if it fits, or allocates a new buffer for it.  Returns
   the actually used iovec.  It also verifies that iov_len does fit a
   signed type, and handles compat iovecs if the compat flag is set.
 - __import_iovec consolidates the native and compat versions of
   import_iovec. It calls iovec_from_user, then validates each iovec
   actually points to user addresses, and ensures the total length
   doesn't overflow.

This has two major implications:

 - the access_process_vm case loses the total lenght checking, which
   wasn't required anyway, given that each call receives two iovecs
   for the local and remote side of the operation, and it verifies
   the total length on the local side already.
 - instead of a single loop there now are two loops over the iovecs.
   Given that the iovecs are cache hot this doesn't make a major
   difference
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

bfdc5970

net: introduce helper sendpage_ok() in include/linux/net.h · c381b079

由 Coly Li 提交于 10月 02, 2020

The original problem was from nvme-over-tcp code, who mistakenly uses
kernel_sendpage() to send pages allocated by __get_free_pages() without
__GFP_COMP flag. Such pages don't have refcount (page_count is 0) on
tail pages, sending them by kernel_sendpage() may trigger a kernel panic
from a corrupted kernel heap, because these pages are incorrectly freed
in network stack as page_count 0 pages.

This patch introduces a helper sendpage_ok(), it returns true if the
checking page,
- is not slab page: PageSlab(page) is false.
- has page refcount: page_count(page) is not zero

All drivers who want to send page to remote end by kernel_sendpage()
may use this helper to check whether the page is OK. If the helper does
not return true, the driver should try other non sendpage method (e.g.
sock_no_sendpage()) to handle the page.
Signed-off-by: NColy Li <colyli@suse.de>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Jan Kara <jack@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Vlastimil Babka <vbabka@suse.com>
Cc: stable@vger.kernel.org
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c381b079

net: core: document two new elements of struct net_device · a93bdcb9

由 Mauro Carvalho Chehab 提交于 10月 02, 2020

As warned by "make htmldocs", there are two new struct elements
that aren't documented:

	../include/linux/netdevice.h:2159: warning: Function parameter or member 'unlink_list' not described in 'net_device'
	../include/linux/netdevice.h:2159: warning: Function parameter or member 'nested_level' not described in 'net_device'

Fixes: 1fc70edb ("net: core: add nested_level variable in net_device")
Signed-off-by: NMauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a93bdcb9

static_call: Fix return type of static_call_init · 69e0ad37

由 Nathan Chancellor 提交于 9月 28, 2020

Functions that are passed to early_initcall should be of type
initcall_t, which expects a return type of int. This is not currently an
error but a patch in the Clang LTO series could change that in the
future.

Fixes: 9183c3f9 ("static_call: Add inline static call infrastructure")
Signed-off-by: NNathan Chancellor <natechancellor@gmail.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NSami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/lkml/20200903203053.3411268-17-samitolvanen@google.com/

69e0ad37

net/mlx5: cmdif, Avoid skipping reclaim pages if FW is not accessible · b898ce7b

由 Saeed Mahameed 提交于 9月 11, 2020

In case of pci is offline reclaim_pages_cmd() will still try to call
the FW to release FW pages, cmd_exec() in this case will return a silent
success without actually calling the FW.

This is wrong and will cause page leaks, what we should do is to detect
pci offline or command interface un-available before tying to access the
FW and manually release the FW pages in the driver.

In this patch we share the code to check for FW command interface
availability and we call it in sensitive places e.g. reclaim_pages_cmd().

Alternative fix:
 1. Remove MLX5_CMD_OP_MANAGE_PAGES form mlx5_internal_err_ret_value,
    command success simulation list.
 2. Always Release FW pages even if cmd_exec fails in reclaim_pages_cmd().
Reviewed-by: NMoshe Shemesh <moshe@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

b898ce7b

net/mlx5: Avoid possible free of command entry while timeout comp handler · 50b2412b

由 Eran Ben Elisha 提交于 8月 04, 2020

Upon command completion timeout, driver simulates a forced command
completion. In a rare case where real interrupt for that command arrives
simultaneously, it might release the command entry while the forced
handler might still access it.

Fix that by adding an entry refcount, to track current amount of allowed
handlers. Command entry to be released only when this refcount is
decremented to zero.

Command refcount is always initialized to one. For callback commands,
command completion handler is the symmetric flow to decrement it. For
non-callback commands, it is wait_func().

Before ringing the doorbell, increment the refcount for the real completion
handler. Once the real completion handler is called, it will decrement it.

For callback commands, once the delayed work is scheduled, increment the
refcount. Upon callback command completion handler, we will try to cancel
the timeout callback. In case of success, we need to decrement the callback
refcount as it will never run.

In addition, gather the entry index free and the entry free into a one
flow for all command types release.

Fixes: e126ba97 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: NEran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: NMoshe Shemesh <moshe@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

50b2412b

mm: memcg/slab: fix slab statistics in !SMP configuration · be458311

由 Roman Gushchin 提交于 10月 01, 2020

Since commit ea426c2a ("mm: memcg: prepare for byte-sized vmstat
items") the write side of slab counters accepts a value in bytes and
converts it to pages.  It happens in __mod_node_page_state().

However a non-SMP version of __mod_node_page_state() doesn't perform
this conversion.  It leads to incorrect (unrealistically high) slab
counters values.  Fix this by adding a similar conversion to the non-SMP
version of __mod_node_page_state().
Signed-off-by: NRoman Gushchin <guro@fb.com>
Reported-and-tested-by: NBastian Bittorf <bb@npl.de>
Fixes: ea426c2a ("mm: memcg: prepare for byte-sized vmstat items")
Acked-by: NVlastimil Babka <vbabka@suse.cz>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

be458311

02 10月, 2020 2 次提交

pipe: remove pipe_wait() and fix wakeup race with splice · 472e5b05

由 Linus Torvalds 提交于 10月 01, 2020

The pipe splice code still used the old model of waiting for pipe IO by
using a non-specific "pipe_wait()" that waited for any pipe event to
happen, which depended on all pipe IO being entirely serialized by the
pipe lock. So by checking the state you were waiting for, and then
adding yourself to the wait queue before dropping the lock, you were
guaranteed to see all the wakeups.

Strictly speaking, the actual wakeups were not done under the lock, but
the pipe_wait() model still worked, because since the waiter held the
lock when checking whether it should sleep, it would always see the
current state, and the wakeup was always done after updating the state.

However, commit 0ddad21d ("pipe: use exclusive waits when reading or
writing") split the single wait-queue into two, and in the process also
made the "wait for event" code wait for _two_ wait queues, and that then
showed a race with the wakers that were not serialized by the pipe lock.

It's only splice that used that "pipe_wait()" model, so the problem
wasn't obvious, but Josef Bacik reports:

"I hit a hang with fstest btrfs/187, which does a btrfs send into
/dev/null. This works by creating a pipe, the write side is given to
the kernel to write into, and the read side is handed to a thread that
splices into a file, in this case /dev/null.

The box that was hung had the write side stuck here [pipe_write] and
the read side stuck here [splice_from_pipe_next -> pipe_wait].

[ more details about pipe_wait() scenario ]

The problem is we're doing the prepare_to_wait, which sets our state
each time, however we can be woken up either with reads or writes. In
the case above we race with the WRITER waking us up, and re-set our
state to INTERRUPTIBLE, and thus never break out of schedule"

Josef had a patch that avoided the issue in pipe_wait() by just making
it set the state only once, but the deeper problem is that pipe_wait()
depends on a level of synchonization by the pipe mutex that it really
shouldn't. And the whole "wait for any pipe state change" model really
isn't very good to begin with.

So rather than trying to work around things in pipe_wait(), remove that
legacy model of "wait for arbitrary pipe event" entirely, and actually
create functions that wait for the pipe actually being readable or
writable, and can do so without depending on the pipe lock serializing
everything.

Fixes: 0ddad21d ("pipe: use exclusive waits when reading or writing")
Link: https://lore.kernel.org/linux-fsdevel/bfa88b5ad6f069b2b679316b9e495a970130416c.1601567868.git.josef@toxicpanda.com/Reported-by: NJosef Bacik <josef@toxicpanda.com>
Reviewed-and-tested-by: NJosef Bacik <josef@toxicpanda.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

472e5b05

regulator: qcom_smd: Add PM660/PM660L regulator support · 6d849653

由 AngeloGioacchino Del Regno 提交于 9月 26, 2020

The PM660 and PM660L are a very very common PMIC combo, found on
boards using the SDM630, SDM636, SDM660 (and SDA variants) SoC.

PM660 provides 6 SMPS and 19 LDOs (of which one is unaccesible),
while PM660L provides 5 SMPS (of which S3 and S4 are combined),
10 LDOs and a Buck-or-Boost (BoB) regulator.

The PM660L IC also provides other regulators that are very
specialized (for example, for the display) and will be managed
in the other appropriate drivers (for example, labibb).
Signed-off-by: NAngeloGioacchino Del Regno <kholk11@gmail.com>
Link: https://lore.kernel.org/r/20200926125549.13191-6-kholk11@gmail.comSigned-off-by: NMark Brown <broonie@kernel.org>

6d849653

01 10月, 2020 1 次提交

debugobjects: Free per CPU pool after CPU unplug · 88451f2c

由 Zqiang 提交于 9月 08, 2020

If a CPU is offlined the debug objects per CPU pool is not cleaned up. If
the CPU is never onlined again then the objects in the pool are wasted.

Add a CPU hotplug callback which is invoked after the CPU is dead to free
the pool.

[ tglx: Massaged changelog and added comment about remote access safety ]
Signed-off-by: NZqiang <qiang.zhang@windriver.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Link: https://lore.kernel.org/r/20200908062709.11441-1-qiang.zhang@windriver.com

88451f2c

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功