提交 · c3d9cdca73d0e49f01a71cdc477a09b04b1b30fc · openeuler / Kernel

03 11月, 2020 4 次提交

H
s390: update defconfigs · c3d9cdca
由 Heiko Carstens 提交于 10月 26, 2020
```
Signed-off-by: NHeiko Carstens <hca@linux.ibm.com>
```
c3d9cdca
H
s390/vdso: remove unused constants · cfef9aa6
由 Heiko Carstens 提交于 10月 26, 2020
```
Signed-off-by: NHeiko Carstens <hca@linux.ibm.com>
```
cfef9aa6
H
s390/vdso: remove empty unused file · e9919866
由 Heiko Carstens 提交于 10月 26, 2020
```
Signed-off-by: NHeiko Carstens <hca@linux.ibm.com>
```
e9919866

s390/mm: make pmd/pud_deref() large page aware · b0e98aa9

由 Gerald Schaefer 提交于 10月 20, 2020

pmd/pud_deref() assume that they will never operate on large pmd/pud
entries, and therefore only use the non-large _xxx_ENTRY_ORIGIN mask.
With commit 9ec8fa8d ("s390/vmemmap: extend modify_pagetable()
to handle vmemmap"), that assumption is no longer true, at least for
pmd_deref().

In theory, we could end up with wrong addresses because some of the
non-address bits of a large entry would not be masked out.
In practice, this does not (yet) show any impact, because vmemmap_free()
is currently never used for s390.

Fix pmd/pud_deref() to check for the entry type and use the
_xxx_ENTRY_ORIGIN_LARGE mask for large entries.

While at it, also move pmd/pud_pfn() around, in order to avoid code
duplication, because they do the same thing.

Fixes: 9ec8fa8d ("s390/vmemmap: extend modify_pagetable() to handle vmemmap")
Cc: <stable@vger.kernel.org> # 5.9
Signed-off-by: NGerald Schaefer <gerald.schaefer@linux.ibm.com>
Reviewed-by: NAlexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: NHeiko Carstens <hca@linux.ibm.com>

b0e98aa9

26 10月, 2020 2 次提交

s390: correct __bootdata / __bootdata_preserved macros · 8e90b4b1

由 Vasily Gorbik 提交于 10月 26, 2020

Currently s390 build is broken.

  SECTCMP .boot.data
error: section .boot.data differs between vmlinux and arch/s390/boot/compressed/vmlinux
make[2]: *** [arch/s390/boot/section_cmp.boot.data] Error 1
  SECTCMP .boot.preserved.data
error: section .boot.preserved.data differs between vmlinux and arch/s390/boot/compressed/vmlinux
make[2]: *** [arch/s390/boot/section_cmp.boot.preserved.data] Error 1
make[1]: *** [bzImage] Error 2

Commit 33def849 ("treewide: Convert macro and uses of __section(foo)
to __section("foo")") converted all __section(foo) to __section("foo").
This is wrong for __bootdata / __bootdata_preserved macros which want
variable names to be a part of intermediate section names .boot.data.<var
name> and .boot.preserved.data.<var name>. Those sections are later
sorted by alignment + name and merged together into final .boot.data
/ .boot.preserved.data sections. Those sections must be identical in
the decompressor and the decompressed kernel (that is checked during
the build).

Fixes: 33def849 ("treewide: Convert macro and uses of __section(foo) to __section("foo")")
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
Signed-off-by: NHeiko Carstens <hca@linux.ibm.com>

8e90b4b1

treewide: Convert macro and uses of __section(foo) to __section("foo") · 33def849

由 Joe Perches 提交于 10月 21, 2020

Use a more generic form for __section that requires quotes to avoid
complications with clang and gcc differences.

Remove the quote operator # from compiler_attributes.h __section macro.

Convert all unquoted __section(foo) uses to quoted __section("foo").
Also convert __attribute__((section("foo"))) uses to __section("foo")
even if the __attribute__ has multiple list entry forms.

Conversion done using the script at:

https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.plSigned-off-by: NJoe Perches <joe@perches.com>
Reviewed-by: NNick Desaulniers <ndesaulniers@gooogle.com>
Reviewed-by: NMiguel Ojeda <ojeda@kernel.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

33def849

21 10月, 2020 1 次提交

s390: virtio: PV needs VIRTIO I/O device protection · 4ce1cf7b

由 Pierre Morel 提交于 9月 10, 2020

If protected virtualization is active on s390, VIRTIO has only retricted
access to the guest memory.
Define CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS and export
arch_has_restricted_virtio_memory_access to advertize VIRTIO if that's
the case.
Signed-off-by: NPierre Morel <pmorel@linux.ibm.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Reviewed-by: NHalil Pasic <pasic@linux.ibm.com>
Link: https://lore.kernel.org/r/1599728030-17085-3-git-send-email-pmorel@linux.ibm.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com>

4ce1cf7b

19 10月, 2020 1 次提交

mm/madvise: introduce process_madvise() syscall: an external memory hinting API · ecb8ac8b

由 Minchan Kim 提交于 10月 17, 2020

There is usecase that System Management Software(SMS) want to give a
memory hint like MADV_[COLD|PAGEEOUT] to other processes and in the
case of Android, it is the ActivityManagerService.

The information required to make the reclaim decision is not known to the
app.  Instead, it is known to the centralized userspace
daemon(ActivityManagerService), and that daemon must be able to initiate
reclaim on its own without any app involvement.

To solve the issue, this patch introduces a new syscall
process_madvise(2).  It uses pidfd of an external process to give the
hint.  It also supports vector address range because Android app has
thousands of vmas due to zygote so it's totally waste of CPU and power if
we should call the syscall one by one for each vma.(With testing 2000-vma
syscall vs 1-vector syscall, it showed 15% performance improvement.  I
think it would be bigger in real practice because the testing ran very
cache friendly environment).

Another potential use case for the vector range is to amortize the cost
ofTLB shootdowns for multiple ranges when using MADV_DONTNEED; this could
benefit users like TCP receive zerocopy and malloc implementations.  In
future, we could find more usecases for other advises so let's make it
happens as API since we introduce a new syscall at this moment.  With
that, existing madvise(2) user could replace it with process_madvise(2)
with their own pid if they want to have batch address ranges support
feature.

ince it could affect other process's address range, only privileged
process(PTRACE_MODE_ATTACH_FSCREDS) or something else(e.g., being the same
UID) gives it the right to ptrace the process could use it successfully.
The flag argument is reserved for future use if we need to extend the API.

I think supporting all hints madvise has/will supported/support to
process_madvise is rather risky.  Because we are not sure all hints make
sense from external process and implementation for the hint may rely on
the caller being in the current context so it could be error-prone.  Thus,
I just limited hints as MADV_[COLD|PAGEOUT] in this patch.

If someone want to add other hints, we could hear the usecase and review
it for each hint.  It's safer for maintenance rather than introducing a
buggy syscall but hard to fix it later.

So finally, the API is as follows,

      ssize_t process_madvise(int pidfd, const struct iovec *iovec,
                unsigned long vlen, int advice, unsigned int flags);

    DESCRIPTION
      The process_madvise() system call is used to give advice or directions
      to the kernel about the address ranges from external process as well as
      local process. It provides the advice to address ranges of process
      described by iovec and vlen. The goal of such advice is to improve
      system or application performance.

      The pidfd selects the process referred to by the PID file descriptor
      specified in pidfd. (See pidofd_open(2) for further information)

      The pointer iovec points to an array of iovec structures, defined in
      <sys/uio.h> as:

        struct iovec {
            void *iov_base;         /* starting address */
            size_t iov_len;         /* number of bytes to be advised */
        };

      The iovec describes address ranges beginning at address(iov_base)
      and with size length of bytes(iov_len).

      The vlen represents the number of elements in iovec.

      The advice is indicated in the advice argument, which is one of the
      following at this moment if the target process specified by pidfd is
      external.

        MADV_COLD
        MADV_PAGEOUT

      Permission to provide a hint to external process is governed by a
      ptrace access mode PTRACE_MODE_ATTACH_FSCREDS check; see ptrace(2).

      The process_madvise supports every advice madvise(2) has if target
      process is in same thread group with calling process so user could
      use process_madvise(2) to extend existing madvise(2) to support
      vector address ranges.

    RETURN VALUE
      On success, process_madvise() returns the number of bytes advised.
      This return value may be less than the total number of requested
      bytes, if an error occurred. The caller should check return value
      to determine whether a partial advice occurred.

FAQ:

Q.1 - Why does any external entity have better knowledge?

Quote from Sandeep

"For Android, every application (including the special SystemServer)
are forked from Zygote.  The reason of course is to share as many
libraries and classes between the two as possible to benefit from the
preloading during boot.

After applications start, (almost) all of the APIs end up calling into
this SystemServer process over IPC (binder) and back to the
application.

In a fully running system, the SystemServer monitors every single
process periodically to calculate their PSS / RSS and also decides
which process is "important" to the user for interactivity.

So, because of how these processes start _and_ the fact that the
SystemServer is looping to monitor each process, it does tend to *know*
which address range of the application is not used / useful.

Besides, we can never rely on applications to clean things up
themselves.  We've had the "hey app1, the system is low on memory,
please trim your memory usage down" notifications for a long time[1].
They rely on applications honoring the broadcasts and very few do.

So, if we want to avoid the inevitable killing of the application and
restarting it, some way to be able to tell the OS about unimportant
memory in these applications will be useful.

- ssp

Q.2 - How to guarantee the race(i.e., object validation) between when
giving a hint from an external process and get the hint from the target
process?

process_madvise operates on the target process's address space as it
exists at the instant that process_madvise is called.  If the space
target process can run between the time the process_madvise process
inspects the target process address space and the time that
process_madvise is actually called, process_madvise may operate on
memory regions that the calling process does not expect.  It's the
responsibility of the process calling process_madvise to close this
race condition.  For example, the calling process can suspend the
target process with ptrace, SIGSTOP, or the freezer cgroup so that it
doesn't have an opportunity to change its own address space before
process_madvise is called.  Another option is to operate on memory
regions that the caller knows a priori will be unchanged in the target
process.  Yet another option is to accept the race for certain
process_madvise calls after reasoning that mistargeting will do no
harm.  The suggested API itself does not provide synchronization.  It
also apply other APIs like move_pages, process_vm_write.

The race isn't really a problem though.  Why is it so wrong to require
that callers do their own synchronization in some manner?  Nobody
objects to write(2) merely because it's possible for two processes to
open the same file and clobber each other's writes --- instead, we tell
people to use flock or something.  Think about mmap.  It never
guarantees newly allocated address space is still valid when the user
tries to access it because other threads could unmap the memory right
before.  That's where we need synchronization by using other API or
design from userside.  It shouldn't be part of API itself.  If someone
needs more fine-grained synchronization rather than process level,
there were two ideas suggested - cookie[2] and anon-fd[3].  Both are
applicable via using last reserved argument of the API but I don't
think it's necessary right now since we have already ways to prevent
the race so don't want to add additional complexity with more
fine-grained optimization model.

To make the API extend, it reserved an unsigned long as last argument
so we could support it in future if someone really needs it.

Q.3 - Why doesn't ptrace work?

Injecting an madvise in the target process using ptrace would not work
for us because such injected madvise would have to be executed by the
target process, which means that process would have to be runnable and
that creates the risk of the abovementioned race and hinting a wrong
VMA.  Furthermore, we want to act the hint in caller's context, not the
callee's, because the callee is usually limited in cpuset/cgroups or
even freezed state so they can't act by themselves quick enough, which
causes more thrashing/kill.  It doesn't work if the target process are
ptraced(e.g., strace, debugger, minidump) because a process can have at
most one ptracer.

[1] https://developer.android.com/topic/performance/memory"

[2] process_getinfo for getting the cookie which is updated whenever
    vma of process address layout are changed - Daniel Colascione -
    https://lore.kernel.org/lkml/20190520035254.57579-1-minchan@kernel.org/T/#m7694416fd179b2066a2c62b5b139b14e3894e224

[3] anonymous fd which is used for the object(i.e., address range)
    validation - Michal Hocko -
    https://lore.kernel.org/lkml/20200120112722.GY18451@dhcp22.suse.cz/

[minchan@kernel.org: fix process_madvise build break for arm64]
  Link: http://lkml.kernel.org/r/20200303145756.GA219683@google.com
[minchan@kernel.org: fix build error for mips of process_madvise]
  Link: http://lkml.kernel.org/r/20200508052517.GA197378@google.com
[akpm@linux-foundation.org: fix patch ordering issue]
[akpm@linux-foundation.org: fix arm64 whoops]
[minchan@kernel.org: make process_madvise() vlen arg have type size_t, per Florian]
[akpm@linux-foundation.org: fix i386 build]
[sfr@canb.auug.org.au: fix syscall numbering]
  Link: https://lkml.kernel.org/r/20200905142639.49fc3f1a@canb.auug.org.au
[sfr@canb.auug.org.au: madvise.c needs compat.h]
  Link: https://lkml.kernel.org/r/20200908204547.285646b4@canb.auug.org.au
[minchan@kernel.org: fix mips build]
  Link: https://lkml.kernel.org/r/20200909173655.GC2435453@google.com
[yuehaibing@huawei.com: remove duplicate header which is included twice]
  Link: https://lkml.kernel.org/r/20200915121550.30584-1-yuehaibing@huawei.com
[minchan@kernel.org: do not use helper functions for process_madvise]
  Link: https://lkml.kernel.org/r/20200921175539.GB387368@google.com
[akpm@linux-foundation.org: pidfd_get_pid() gained an argument]
[sfr@canb.auug.org.au: fix up for "iov_iter: transparently handle compat iovecs in import_iovec"]
  Link: https://lkml.kernel.org/r/20200928212542.468e1fef@canb.auug.org.auSigned-off-by: NMinchan Kim <minchan@kernel.org>
Signed-off-by: NYueHaibing <yuehaibing@huawei.com>
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Reviewed-by: NSuren Baghdasaryan <surenb@google.com>
Reviewed-by: NVlastimil Babka <vbabka@suse.cz>
Acked-by: NDavid Rientjes <rientjes@google.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Christian Brauner <christian@brauner.io>
Cc: Daniel Colascione <dancol@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: John Dias <joaodias@google.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oleksandr Natalenko <oleksandr@redhat.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: SeongJae Park <sj38.park@gmail.com>
Cc: SeongJae Park <sjpark@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Sonny Rao <sonnyrao@google.com>
Cc: Tim Murray <timmurray@google.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Florian Weimer <fw@deneb.enyo.de>
Cc: <linux-man@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200302193630.68771-3-minchan@kernel.org
Link: http://lkml.kernel.org/r/20200508183320.GA125527@google.com
Link: http://lkml.kernel.org/r/20200622192900.22757-4-minchan@kernel.org
Link: https://lkml.kernel.org/r/20200901000633.1920247-4-minchan@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ecb8ac8b

18 10月, 2020 1 次提交

tracehook: clear TIF_NOTIFY_RESUME in tracehook_notify_resume() · 3c532798

由 Jens Axboe 提交于 10月 03, 2020

All the callers currently do this, clean it up and move the clearing
into tracehook_notify_resume() instead.
Reviewed-by: NOleg Nesterov <oleg@redhat.com>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

3c532798

14 10月, 2020 3 次提交

arch, drivers: replace for_each_membock() with for_each_mem_range() · b10d6bca

由 Mike Rapoport 提交于 10月 13, 2020

There are several occurrences of the following pattern:

	for_each_memblock(memory, reg) {
		start = __pfn_to_phys(memblock_region_memory_base_pfn(reg);
		end = __pfn_to_phys(memblock_region_memory_end_pfn(reg));

		/* do something with start and end */
	}

Using for_each_mem_range() iterator is more appropriate in such cases and
allows simpler and cleaner code.

[akpm@linux-foundation.org: fix arch/arm/mm/pmsa-v7.c build]
[rppt@linux.ibm.com: mips: fix cavium-octeon build caused by memblock refactoring]
  Link: http://lkml.kernel.org/r/20200827124549.GD167163@linux.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Emil Renner Berthing <kernel@esmil.dk>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: https://lkml.kernel.org/r/20200818151634.14343-13-rppt@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b10d6bca

arch, mm: replace for_each_memblock() with for_each_mem_pfn_range() · c9118e6c

由 Mike Rapoport 提交于 10月 13, 2020

There are several occurrences of the following pattern:

	for_each_memblock(memory, reg) {
		start_pfn = memblock_region_memory_base_pfn(reg);
		end_pfn = memblock_region_memory_end_pfn(reg);

		/* do something with start_pfn and end_pfn */
	}

Rather than iterate over all memblock.memory regions and each time query
for their start and end PFNs, use for_each_mem_pfn_range() iterator to get
simpler and clearer code.
Signed-off-by: NMike Rapoport <rppt@linux.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Reviewed-by: NBaoquan He <bhe@redhat.com>
Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>	[.clang-format]
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Emil Renner Berthing <kernel@esmil.dk>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: https://lkml.kernel.org/r/20200818151634.14343-12-rppt@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c9118e6c

memblock: make memblock_debug and related functionality private · 87c55870

由 Mike Rapoport 提交于 10月 13, 2020

The only user of memblock_dbg() outside memblock was s390 setup code and
it is converted to use pr_debug() instead.  This allows to stop exposing
memblock_debug and memblock_dbg() to the rest of the kernel.

[akpm@linux-foundation.org: make memblock_dbg() safer and neater]
Signed-off-by: NMike Rapoport <rppt@linux.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Reviewed-by: NBaoquan He <bhe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Emil Renner Berthing <kernel@esmil.dk>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: https://lkml.kernel.org/r/20200818151634.14343-10-rppt@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

87c55870

10 10月, 2020 3 次提交

s390/uaccess: fix indentation · 10e5afb3

由 Heiko Carstens 提交于 10月 08, 2020

Signed-off-by: NHeiko Carstens <hca@linux.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

10e5afb3

s390/uaccess: add default cases for __put_user_fn()/__get_user_fn() · db527397

由 Heiko Carstens 提交于 10月 08, 2020

Add default cases for __put_user_fn()/__get_user_fn(). This doesn't
fix anything since the functions are only called with sane values.

However we get rid of smatch warnings:
./arch/s390/include/asm/uaccess.h:143 __get_user_fn() error: uninitialized symbol 'rc'.
Signed-off-by: NHeiko Carstens <hca@linux.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

db527397

s390/kprobes: move insn_page to text segment · b61e1f32

由 Heiko Carstens 提交于 9月 18, 2020

Move the in-kernel kprobes insn page to text segment. Rationale:
having that page in rw data segment is suboptimal, since as soon as a
kprobe is set, this will split the 1:1 kernel mapping for a single
page which get new permissions.

Note: there is always at least one kprobe present for the kretprobe
trampoline; so the mapping will always be split into smaller 4k
mappings because of this.

Moving the kprobes insn page into text segment makes sure that the
page is mapped RO/X in any case, and avoids that the 1:1 mapping is
split.

The kprobe insn_page is defined as a dummy function which is filled
with "br %r14" instructions.
Signed-off-by: NHeiko Carstens <hca@linux.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

b61e1f32

09 10月, 2020 2 次提交

kbuild: explicitly specify the build id style · a9684337

由 Bill Wendling 提交于 9月 22, 2020

ld's --build-id defaults to "sha1" style, while lld defaults to "fast".
The build IDs are very different between the two, which may confuse
programs that reference them.
Signed-off-by: NBill Wendling <morbo@google.com>
Acked-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>

a9684337

seccomp: Move config option SECCOMP to arch/Kconfig · 282a181b

由 YiFei Zhu 提交于 9月 24, 2020

In order to make adding configurable features into seccomp easier,
it's better to have the options at one single location, considering
especially that the bulk of seccomp code is arch-independent. An quick
look also show that many SECCOMP descriptions are outdated; they talk
about /proc rather than prctl.

As a result of moving the config option and keeping it default on,
architectures arm, arm64, csky, riscv, sh, and xtensa did not have SECCOMP
on by default prior to this and SECCOMP will be default in this change.

Architectures microblaze, mips, powerpc, s390, sh, and sparc have an
outdated depend on PROC_FS and this dependency is removed in this change.
Suggested-by: NJann Horn <jannh@google.com>
Link: https://lore.kernel.org/lkml/CAG48ez1YWz9cnp08UZgeieYRhHdqh-ch7aNwc4JRBnGyrmgfMg@mail.gmail.com/Signed-off-by: NYiFei Zhu <yifeifz2@illinois.edu>
[kees: added HAVE_ARCH_SECCOMP help text, tweaked wording]
Signed-off-by: NKees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/9ede6ef35c847e58d61e476c6a39540520066613.1600951211.git.yifeifz2@illinois.edu

282a181b

08 10月, 2020 5 次提交

s390/pci: track whether util_str is valid in the zpci_dev · 517fe298

由 Matthew Rosato 提交于 10月 07, 2020

We'll need to keep track of whether or not the byte string in util_str is
valid and thus needs to be passed to a vfio-pci passthrough device.
Signed-off-by: NMatthew Rosato <mjrosato@linux.ibm.com>
Acked-by: NNiklas Schnelle <schnelle@linux.ibm.com>
Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Acked-by: NCornelia Huck <cohuck@redhat.com>
Signed-off-by: NAlex Williamson <alex.williamson@redhat.com>

517fe298

s390/pci: stash version in the zpci_dev · dc8c638d

由 Matthew Rosato 提交于 10月 07, 2020

In preparation for passing the info on to vfio-pci devices, stash the
supported PCI version for the target device in the zpci_dev.
Signed-off-by: NMatthew Rosato <mjrosato@linux.ibm.com>
Acked-by: NNiklas Schnelle <schnelle@linux.ibm.com>
Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Acked-by: NCornelia Huck <cohuck@redhat.com>
Signed-off-by: NAlex Williamson <alex.williamson@redhat.com>

dc8c638d

s390/sie: fix typo in SIGP code description · eefc69a0

由 Julian Wiedmann 提交于 10月 02, 2020

s/ait address/at address
Signed-off-by: NJulian Wiedmann <jwi@linux.ibm.com>
Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

eefc69a0

s390/lib: fix kernel doc for memcmp() · 4aa32ee3

由 Julian Wiedmann 提交于 10月 02, 2020

s/count/n
Signed-off-by: NJulian Wiedmann <jwi@linux.ibm.com>
Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

4aa32ee3

s390/sclp: Add support for SCLP AP adapter config/deconfig · 0671cc10

由 Harald Freudenberger 提交于 7月 27, 2020

Add support for AP bus adapter config and deconfig to the sclp
core code. The code is statically build into the kernel when
ZCRYPT is configured either as module or with static support.

This is the base functionality for having configure/deconfigure
support in the AP bus and card code. Another patch will exploit
this soon.
Signed-off-by: NHarald Freudenberger <freude@linux.ibm.com>
Suggested-by: NPierre Morel <pmorel@linux.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

0671cc10

06 10月, 2020 2 次提交

dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h> · 0b1abd1f

由 Christoph Hellwig 提交于 9月 11, 2020

Merge dma-contiguous.h into dma-map-ops.h, after removing the comment
describing the contiguous allocator into kernel/dma/contigous.c.
Signed-off-by: NChristoph Hellwig <hch@lst.de>

0b1abd1f

dma-mapping: split <linux/dma-mapping.h> · 0a0f0d8b

由 Christoph Hellwig 提交于 9月 22, 2020

Split out all the bits that are purely for dma_map_ops implementations
and related code into a new <linux/dma-map-ops.h> header so that they
don't get pulled into all the drivers. That also means the architecture
specific <asm/dma-mapping.h> is not pulled in by <linux/dma-mapping.h>
any more, which leads to a missing includes that were pulled in by the
x86 or arm versions in a few not overly portable drivers.
Signed-off-by: NChristoph Hellwig <hch@lst.de>

0a0f0d8b

03 10月, 2020 3 次提交

mm: remove compat_process_vm_{readv,writev} · c3973b40

由 Christoph Hellwig 提交于 9月 25, 2020

Now that import_iovec handles compat iovecs, the native syscalls
can be used for the compat case as well.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

c3973b40

fs: remove compat_sys_vmsplice · 598b3cec

由 Christoph Hellwig 提交于 9月 25, 2020

Now that import_iovec handles compat iovecs, the native vmsplice syscall
can be used for the compat case as well.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

598b3cec

fs: remove the compat readv/writev syscalls · 5f764d62

由 Christoph Hellwig 提交于 9月 25, 2020

Now that import_iovec handles compat iovecs, the native readv and writev
syscalls can be used for the compat case as well.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

5f764d62

02 10月, 2020 7 次提交

s390/startup: correct early pgm check info formatting · 4ec95ed3

由 Vasily Gorbik 提交于 10月 01, 2020

Early sclp console messages are printed in line mode on z/VM and LPAR,
but under kvm newlines matter. Add a missing newline between "kernel
version" and "Kernel fault".
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

4ec95ed3

s390: remove orphaned extern variables declarations · 100a980c

由 Vasily Gorbik 提交于 9月 27, 2020

arch/s390/kernel/entry.h: suspend_zero_pages - only declaration left
after commit 39421627 ("s390: remove broken hibernate / power
management support")

arch/s390/include/asm/setup.h: vmhalt_cmd - only declaration left after
commit 99ca4e58 ("[S390] kernel: Shutdown Actions Interface")

arch/s390/include/asm/setup.h: vmpoff_cmd - only declaration left after
commit 99ca4e58 ("[S390] kernel: Shutdown Actions Interface")
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

100a980c

s390/kasan: make sure int handler always run with DAT on · 21a66717

由 Vasily Gorbik 提交于 9月 24, 2020

Since commit 998f5bbe ("s390/kasan: fix early pgm check handler
execution") early pgm check handler is executed with DAT on if Kasan
is enabled.

Still there is a window between setup_lowcore_dat_off() and
setup_lowcore_dat_on() when int handlers could be executed with DAT off
under Kasan. If this happens the kernel ends up in pgm check loop due
to Kasan shadow memory access attempts.

With Kasan enabled paging is initialized much earlier and DAT flag has to
be on at all times instrumented code is executed. Make sure int handlers
are set up to be called with DAT on right away in this case.
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

21a66717

s390/ipl: add support to control memory clearing for nvme re-IPL · 5627b922

由 Gerald Schaefer 提交于 6月 23, 2020

Re-IPL for nvme is currently done by using diag 308 with the "Load Clear"
subcode, which means that all memory will be cleared.
This can increase re-IPL duration considerably on very large machines.

For list-directed IPL like nvme or fcp IPL, a "Load Normal" subcode was
introduced with z14. The "Load Normal" diag 308 subcode allows to re-IPL
without clearing memory.

This patch adds a new "clear" sysfs attribute to /sys/firmware/reipl/nvme,
which can be set to either "0" or "1" to disable or enable re-IPL with
memory clearing. The default value is "0", which disables memory clearing.
Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
Reviewed-by: NVasily Gorbik <gor@linux.ibm.com>
Tested-by: NAlexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

5627b922

s390/nvme: support firmware-assisted dump to NVMe disks · bd37b368

由 Alexander Egorenkov 提交于 9月 29, 2020

From the kernel perspective NVMe dump works exactly like zFCP dump.
Therefore, adapt all places where code explicitly tests only for
IPL of type FCP DUMP. And also set the memory end correctly in this case.
Signed-off-by: NAlexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: NVasily Gorbik <gor@linux.ibm.com>
Reviewed-by: NPhilipp Rudo <prudo@linux.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

bd37b368

s390/ipl: support NVMe IPL kernel parameters · d9f12e48

由 Alexander Egorenkov 提交于 9月 29, 2020

Enable extracting of extra kernel command-line parameters
from the NVMe IPL block passed by the firmware to the kernel
at boot.
Signed-off-by: NAlexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: NVasily Gorbik <gor@linux.ibm.com>
Reviewed-by: NPhilipp Rudo <prudo@linux.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

d9f12e48

s390: nvme dump support · d70e38cb

由 Jason J. Herne 提交于 3月 17, 2020

Add the nvme dump ipl type, associated data, and sysfs entries. This allows
booting into a stand alone dump environment that resides on an nvme device.
Signed-off-by: NJason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

d70e38cb

30 9月, 2020 6 次提交

s390: remove orphaned function declarations · 402e9228

由 Vasily Gorbik 提交于 9月 27, 2020

arch/s390/pci/pci_bus.h: zpci_bus_init - only declaration left after
commit 05bc1be6 ("s390/pci: create zPCI bus")

arch/s390/include/asm/gmap.h: gmap_pte_notify - only declaration left
after commit 4be130a0 ("s390/mm: add shadow gmap support")

arch/s390/include/asm/pgalloc.h: rcu_table_freelist_finish - only
declaration left after commit 36409f63 ("[S390] use generic RCU
page-table freeing code")

arch/s390/include/asm/tlbflush.h: smp_ptlb_all - only declaration left
after commit 5a79859a ("s390: remove 31 bit support")

arch/s390/include/asm/vtimer.h: init_cpu_vtimer - only declaration left
after commit b5f87f15 ("s390/idle: consolidate idle functions and
definitions")

arch/s390/include/asm/pci.h: zpci_debug_info - only declaration left
after commit 386aa051 ("s390/pci: remove per device debug attribute")

arch/s390/include/asm/vdso.h: vdso_alloc_boot_cpu - only declaration
left after commit 4bff8cb5 ("s390: convert to GENERIC_VDSO")

arch/s390/include/asm/smp.h: smp_vcpu_scheduled - only declaration left
after commit 67626fad ("s390: enforce CONFIG_SMP")

arch/s390/kernel/entry.h: restart_call_handler - only declaration left
after commit 8b646bd7 ("[S390] rework smp code")

arch/s390/kernel/entry.h: startup_init_nobss - only declaration left
after commit 2e83e0eb ("s390: clean .bss before running uncompressed
kernel")

arch/s390/kernel/entry.h: s390_early_resume - only declaration left after
commit 39421627 ("s390: remove broken hibernate / power management
support")

drivers/s390/char/raw3270.h: raw3270_request_alloc_bootmem - only
declaration left after commit 33403dcf ("[S390] 3270 console:
convert from bootmem to slab")

drivers/s390/cio/device.h: ccw_device_schedule_sch_unregister - only
declaration left after commit 37de53bb ("[S390] cio: introduce ccw
device todos")

drivers/s390/char/tape.h: tape_hotplug_event - has only declaration
since recorded git history.

drivers/s390/char/tape.h: tape_oper_handler - has only declaration since
recorded git history.

drivers/s390/char/tape.h: tape_noper_handler - has only declaration
since recorded git history.

drivers/s390/char/tape_std.h: tape_std_check_locate - only declaration
left after commit 161beff8 ("s390/tape: remove tape block leftovers")

drivers/s390/char/tape_std.h: tape_std_default_handler - has only
declaration since recorded git history.

drivers/s390/char/tape_std.h: tape_std_unexpect_uchk_handler - has only
declaration since recorded git history.

drivers/s390/char/tape_std.h: tape_std_irq - has only declaration since
recorded git history.

drivers/s390/char/tape_std.h: tape_std_error_recovery - has only
declaration since recorded git history.

drivers/s390/char/tape_std.h: tape_std_error_recovery_has_failed -
has only declaration since recorded git history.

drivers/s390/char/tape_std.h: tape_std_error_recovery_succeded - has
only declaration since recorded git history.

drivers/s390/char/tape_std.h: tape_std_error_recovery_do_retry - has
only declaration since recorded git history.

drivers/s390/char/tape_std.h: tape_std_error_recovery_read_opposite -
has only declaration since recorded git history.

drivers/s390/char/tape_std.h: tape_std_error_recovery_HWBUG - has only
declaration since recorded git history.
Reviewed-by: NSven Schnelle <svens@linux.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

402e9228

s390/startup: add kaslr_offset to pgm check info print · 3ca8b855

由 Vasily Gorbik 提交于 8月 11, 2019

startup pgm check handler is active since the very beginning of kernel
code execution until uncompressed kernel sets up s390_base_pgm_handler.

It is useful not just for the decompressor debugging itself, but also for
early code of uncompressed kernel, in particular Kasan initialization. But
since there is no stack trace or symbolic representation of failing psw
address it is impossible to figure out faulty code location without
knowing Kaslr kernel base. So, let's add it to the startup pgm check
info printed as well.
Reviewed-by: NSven Schnelle <svens@linux.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

3ca8b855

s390/startup: correct "dfltcc" option parsing · 86cde618

由 Vasily Gorbik 提交于 9月 25, 2020

Currently if just "dfltcc" is passed as a kernel command line option
"val" going to be NULL, this leads to reading at address 0 in
strcmp(val, "off")

Fix that by making sure "val" is not NULL. This does not affect option
handling logic.
Reviewed-by: NSven Schnelle <svens@linux.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

86cde618

s390/vdso: remove orphaned declarations · 3731ac57

由 Vasily Gorbik 提交于 9月 27, 2020

Remove couple of declarations which are unused since commit 4bff8cb5
("s390: convert to GENERIC_VDSO").
Acked-by: NSven Schnelle <svens@linux.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

3731ac57

s390/cio: remove unused channel_subsystem_reinit · 54530ce6

由 Vasily Gorbik 提交于 9月 26, 2020

Added with commit 77e844b9 ("s390/hibernate: add early resume
function") unused since commit 39421627 ("s390: remove broken
hibernate / power management support").
Reviewed-by: NVineeth Vijayan <vneethv@linux.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

54530ce6

s390: remove cad commandline option · ad3e6948

由 Sven Schnelle 提交于 9月 28, 2020

remove the cad command line option as the instruction was never
published and never used by userspace.
Signed-off-by: NSven Schnelle <svens@linux.ibm.com>
Reviewed-by: NVasily Gorbik <gor@linux.ibm.com>
Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

ad3e6948

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功