提交 · b2d24b97b2a9691351920e700bfda4368c177232 · openeuler / Kernel

29 4月, 2019 1 次提交

s390/kernel: introduce .dma sections · a80313ff

由 Gerald Schaefer 提交于 2月 03, 2019

With a relocatable kernel that could reside at any place in memory, code
and data that has to stay below 2 GB needs special handling.

This patch introduces .dma sections for such text, data and ex_table.
The sections will be part of the decompressor kernel, so they will not
be relocated and stay below 2 GB. Their location is passed over to the
decompressed / relocated kernel via the .boot.preserved.data section.

The duald and aste for control register setup also need to stay below
2 GB, so move the setup code from arch/s390/kernel/head64.S to
arch/s390/boot/head.S. The duct and linkage_stack could reside above
2 GB, but their content has to be preserved for the decompresed kernel,
so they are also moved into the .dma section.

The start and end address of the .dma sections is added to vmcoreinfo,
for crash support, to help debugging in case the kernel crashed there.
Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
Reviewed-by: NPhilipp Rudo <prudo@linux.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

a80313ff

23 4月, 2019 2 次提交

s390/mm: convert to the generic get_user_pages_fast code · 1a42010c

由 Martin Schwidefsky 提交于 4月 23, 2019

Define the gup_fast_permitted to check against the asce_limit of the
mm attached to the current task, then replace the s390 specific gup
code with the generic implementation in mm/gup.c.
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

1a42010c

s390/mm: make the pxd_offset functions more robust · d1874a0c

由 Martin Schwidefsky 提交于 4月 23, 2019

Change the way how pgd_offset, p4d_offset, pud_offset and pmd_offset
walk the page tables. pgd_offset now always calculates the index for
the top-level page table and adds it to the pgd, this is either a
segment table offset for a 2-level setup, a region-3 offset for 3-levels,
region-2 offset for 4-levels, or a region-1 offset for a 5-level setup.
The other three functions p4d_offset, pud_offset and pmd_offset will
only add the respective offset if they dereference the passed pointer.

With the new way of walking the page tables a sequence like this from
mm/gup.c now works:

     pgdp = pgd_offset(current->mm, addr);
     pgd = READ_ONCE(*pgdp);
     p4dp = p4d_offset(&pgd, addr);
     p4d = READ_ONCE(*p4dp);
     pudp = pud_offset(&p4d, addr);
     pud = READ_ONCE(*pudp);
     pmdp = pmd_offset(&pud, addr);
     pmd = READ_ONCE(*pmdp);
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

d1874a0c

10 4月, 2019 1 次提交

s390/mm: silence compiler warning when compiling without CONFIG_PGSTE · 81a8f2be

由 Thomas Huth 提交于 4月 07, 2019

If CONFIG_PGSTE is not set (e.g. when compiling without KVM), GCC complains:

  CC      arch/s390/mm/pgtable.o
arch/s390/mm/pgtable.c:413:15: warning: ‘pmd_alloc_map’ defined but not
 used [-Wunused-function]
 static pmd_t *pmd_alloc_map(struct mm_struct *mm, unsigned long addr)
               ^~~~~~~~~~~~~

Wrap the function with "#ifdef CONFIG_PGSTE" to silence the warning.
Signed-off-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

81a8f2be

06 3月, 2019 2 次提交

mm: update ptep_modify_prot_commit to take old pte value as arg · 04a86453

由 Aneesh Kumar K.V 提交于 3月 05, 2019

Architectures like ppc64 require to do a conditional tlb flush based on
the old and new value of pte.  Enable that by passing old pte value as
the arg.

Link: http://lkml.kernel.org/r/20190116085035.29729-3-aneesh.kumar@linux.ibm.comSigned-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

04a86453

mm: update ptep_modify_prot_start/commit to take vm_area_struct as arg · 0cbe3e26

由 Aneesh Kumar K.V 提交于 3月 05, 2019

Patch series "NestMMU pte upgrade workaround for mprotect", v5.

We can upgrade pte access (R -> RW transition) via mprotect.  We need to
make sure we follow the recommended pte update sequence as outlined in
commit bd5050e3 ("powerpc/mm/radix: Change pte relax sequence to
handle nest MMU hang") for such updates.  This patch series does that.

This patch (of 5):

Some architectures may want to call flush_tlb_range from these helpers.

Link: http://lkml.kernel.org/r/20190116085035.29729-2-aneesh.kumar@linux.ibm.comSigned-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0cbe3e26

01 3月, 2019 1 次提交

s390: clean up redundant facilities list setup · d8901f2b

由 Vasily Gorbik 提交于 2月 27, 2019

Facilities list in the lowcore is initially set up by verify_facilities
from als.c and later initializations are redundant, so cleaning them up.
Reviewed-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

d8901f2b

21 2月, 2019 2 次提交

s390/extmem: print DCSS range with %px · f1777625

由 Gerald Schaefer 提交于 2月 21, 2019

The DCSS range is currently printed with %p, which results in hashed values
instead of the actual addresses.

Use %px instead, the DCSS ranges do not reveal any kernel symbol addresses.
Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

f1777625

s390/extmem: remove code for 31 bit addressing mode · ca571146

由 Gerald Schaefer 提交于 2月 21, 2019

All supported releases of z/VM allow 64 bit subcodes and addressing mode
for diag 0x64.

This patch removes a lot of code for handling 31 bit addressing mode and
old subcodes.
Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

ca571146

07 2月, 2019 1 次提交

s390/mmap: take stack_guard_gap into account for mmap_base · a0308c13

由 Martin Schwidefsky 提交于 1月 29, 2019

The s390 version of the mmap_base function is ignorant of stack_guard_gap
which can lead to a placement of the stack vs. the mmap base that does not
leave enough space for the stack rlimit.

Add the stack_guard_gap to the calculation and while we are at it the
check for gap+pad overflows as well.
Reviewed-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

a0308c13

18 1月, 2019 1 次提交

s390: remove the ptep_modify_prot_{start,commit} exports · 32b77252

由 Christoph Hellwig 提交于 12月 19, 2018

These two functions are only used by core MM code, so no need to export
them.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

32b77252

29 12月, 2018 3 次提交

mm, memory_hotplug: add nid parameter to arch_remove_memory · 2c2a5af6

由 Oscar Salvador 提交于 12月 28, 2018

Patch series "Do not touch pages in hot-remove path", v2.

This patchset aims for two things:

 1) A better definition about offline and hot-remove stage
 2) Solving bugs where we can access non-initialized pages
    during hot-remove operations [2] [3].

This is achieved by moving all page/zone handling to the offline
stage, so we do not need to access pages when hot-removing memory.

[1] https://patchwork.kernel.org/cover/10691415/
[2] https://patchwork.kernel.org/patch/10547445/
[3] https://www.spinics.net/lists/linux-mm/msg161316.html

This patch (of 5):

This is a preparation for the following-up patches.  The idea of passing
the nid is that it will allow us to get rid of the zone parameter
afterwards.

Link: http://lkml.kernel.org/r/20181127162005.15833-2-osalvador@suse.deSigned-off-by: NOscar Salvador <osalvador@suse.de>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Reviewed-by: NPavel Tatashin <pasha.tatashin@soleen.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2c2a5af6

mm: convert totalram_pages and totalhigh_pages variables to atomic · ca79b0c2

由 Arun KS 提交于 12月 28, 2018

totalram_pages and totalhigh_pages are made static inline function.

Main motivation was that managed_page_count_lock handling was complicating
things.  It was discussed in length here,
https://lore.kernel.org/patchwork/patch/995739/#1181785 So it seemes
better to remove the lock and convert variables to atomic, with preventing
poteintial store-to-read tearing as a bonus.

[akpm@linux-foundation.org: coding style fixes]
Link: http://lkml.kernel.org/r/1542090790-21750-4-git-send-email-arunks@codeaurora.orgSigned-off-by: NArun KS <arunks@codeaurora.org>
Suggested-by: NMichal Hocko <mhocko@suse.com>
Suggested-by: NVlastimil Babka <vbabka@suse.cz>
Reviewed-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: NPavel Tatashin <pasha.tatashin@soleen.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Acked-by: NVlastimil Babka <vbabka@suse.cz>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ca79b0c2

kasan: rename kasan_zero_page to kasan_early_shadow_page · 9577dd74

由 Andrey Konovalov 提交于 12月 28, 2018

With tag based KASAN mode the early shadow value is 0xff and not 0x00, so
this patch renames kasan_zero_(page|pte|pmd|pud|p4d) to
kasan_early_shadow_(page|pte|pmd|pud|p4d) to avoid confusion.

Link: http://lkml.kernel.org/r/3fed313280ebf4f88645f5b89ccbc066d320e177.1544099024.git.andreyknvl@google.comSigned-off-by: NAndrey Konovalov <andreyknvl@google.com>
Suggested-by: NMark Rutland <mark.rutland@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9577dd74

30 11月, 2018 1 次提交

s390: use common bust_spinlocks() · 5b39fc04

由 Sergey Senozhatsky 提交于 10月 25, 2018

s390 is the only architecture that is using own bust_spinlocks()
variant, while other arch-s seem to be OK with the common
implementation.

Heiko Carstens [1] said he would prefer s390 to use the common
bust_spinlocks() as well:
  I did some code archaeology and this function is unchanged since ~17
  years. When it was introduced it was close to identical to the x86
  variant. All other architectures use the common code variant in the
  meantime. So if we change this I'd prefer that we switch s390 to the
  common code variant as well. Right now I can't see a reason for not
  doing that

This patch removes s390 bust_spinlocks() and drops the weak attribute
from the common bust_spinlocks() version.

[1] lkml.kernel.org/r/20181025062800.GB4037@osiris
Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

5b39fc04

27 11月, 2018 1 次提交

s390/mm: correct pgtable_bytes on page table downgrade · 814cedbc

由 Martin Schwidefsky 提交于 11月 27, 2018

The downgrade of a page table from 3 levels to 2 levels for a 31-bit compat
process removes a pmd table which has to be counted against pgtable_bytes.
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

814cedbc

09 11月, 2018 1 次提交

s390/mm: Convert tlb_table_flush() to use call_rcu() · 0d4e68e2

由 Paul E. McKenney 提交于 10月 30, 2018

Now that call_rcu()'s callback is not invoked until after all
preempt-disable regions of code have completed (in addition to explicitly
marked RCU read-side critical sections), call_rcu() can be used in place
of call_rcu_sched().  This commit therefore makes that change.
Signed-off-by: NPaul E. McKenney <paulmck@linux.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <linux-s390@vger.kernel.org>

0d4e68e2

02 11月, 2018 1 次提交

s390/mm: fix mis-accounting of pgtable_bytes · e12e4044

由 Martin Schwidefsky 提交于 10月 15, 2018

In case a fork or a clone system fails in copy_process and the error
handling does the mmput() at the bad_fork_cleanup_mm label, the
following warning messages will appear on the console:

BUG: non-zero pgtables_bytes on freeing mm: 16384

The reason for that is the tricks we play with mm_inc_nr_puds() and
mm_inc_nr_pmds() in init_new_context().

A normal 64-bit process has 3 levels of page table, the p4d level and
the pud level are folded. On process termination the free_pud_range()
function in mm/memory.c will subtract 16KB from pgtable_bytes with a
mm_dec_nr_puds() call, but there actually is not really a pud table.

One issue with this is the fact that pgtable_bytes is usually off
by a few kilobytes, but the more severe problem is that for a failed
fork or clone the free_pgtables() function is not called. In this case
there is no mm_dec_nr_puds() or mm_dec_nr_pmds() that go together with
the mm_inc_nr_puds() and mm_inc_nr_pmds in init_new_context().
The pgtable_bytes will be off by 16384 or 32768 bytes and we get the
BUG message. The message itself is purely cosmetic, but annoying.

To fix this override the mm_pmd_folded, mm_pud_folded and mm_p4d_folded
function to check for the true size of the address space.
Reported-by: NLi Wang <liwang@redhat.com>
Tested-by: NLi Wang <liwang@redhat.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

e12e4044

31 10月, 2018 3 次提交

mm: remove include/linux/bootmem.h · 57c8a661

由 Mike Rapoport 提交于 10月 30, 2018

Move remaining definitions and declarations from include/linux/bootmem.h
into include/linux/memblock.h and remove the redundant header.

The includes were replaced with the semantic patch below and then
semi-automated removal of duplicated '#include <linux/memblock.h>

@@
@@
- #include <linux/bootmem.h>
+ #include <linux/memblock.h>

[sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
  Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
[sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
  Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
[sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
  Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Acked-by: NMichal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

57c8a661

memblock: rename free_all_bootmem to memblock_free_all · c6ffc5ca

由 Mike Rapoport 提交于 10月 30, 2018

The conversion is done using

sed -i 's@free_all_bootmem@memblock_free_all@' \
    $(git grep -l free_all_bootmem)

Link: http://lkml.kernel.org/r/1536927045-23536-26-git-send-email-rppt@linux.vnet.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c6ffc5ca

memblock: rename memblock_alloc{_nid,_try_nid} to memblock_phys_alloc* · 9a8dd708

由 Mike Rapoport 提交于 10月 30, 2018

Make it explicit that the caller gets a physical address rather than a
virtual one.

This will also allow using meblock_alloc prefix for memblock allocations
returning virtual address, which is done in the following patches.

The conversion is done using the following semantic patch:

@@
expression e1, e2, e3;
@@
(
- memblock_alloc(e1, e2)
+ memblock_phys_alloc(e1, e2)
|
- memblock_alloc_nid(e1, e2, e3)
+ memblock_phys_alloc_nid(e1, e2, e3)
|
- memblock_alloc_try_nid(e1, e2, e3)
+ memblock_phys_alloc_try_nid(e1, e2, e3)
)

Link: http://lkml.kernel.org/r/1536927045-23536-7-git-send-email-rppt@linux.vnet.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9a8dd708

22 10月, 2018 1 次提交

s390/kasan: support preemptible kernel build · cf3dbe5d

由 Vasily Gorbik 提交于 10月 19, 2018

When the kernel is built with:
CONFIG_PREEMPT=y
CONFIG_PREEMPT_COUNT=y
"stfle" function used by kasan initialization code makes additional
call to preempt_count_add/preempt_count_sub. To avoid removing kasan
instrumentation from sched code where those functions leave split stfle
function and provide __stfle variant without preemption handling to be
used by Kasan.
Reported-by: NBenjamin Block <bblock@linux.ibm.com>
Acked-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

cf3dbe5d

09 10月, 2018 14 次提交

s390/kasan: add support for mem= kernel parameter · 78333d1f

由 Vasily Gorbik 提交于 9月 26, 2018

Handle mem= kernel parameter in kasan to limit physical memory.
Reviewed-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

78333d1f

s390/kasan: optimize kasan vmemmap allocation · 12e55fa1

由 Vasily Gorbik 提交于 9月 13, 2018

Kasan implementation now supports memory hotplug operations. For that
reason regions of initially standby memory are now skipped from
shadow mapping and are mapped/unmapped dynamically upon bringing
memory online/offline.
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

12e55fa1

s390/kasan: avoid kasan crash with standby memory defined · 29635239

由 Vasily Gorbik 提交于 9月 13, 2018

Kasan early memory allocator simply chops off memory blocks from the
end of the physical memory. Reuse mem_detect info to identify actual
online memory end rather than using max_physmem_end. This allows to run
the kernel with kasan enabled and standby memory defined.
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

29635239

s390/mm: improve debugfs ptdump markers walking · 6cad0eb5

由 Vasily Gorbik 提交于 11月 17, 2017

This allows to print multiple markers when they happened to have the
same value.

...
0x001bfffff0100000-0x001c000000000000       255M PMD I
---[ Kasan Shadow End ]---
---[ vmemmap Area ]---
0x001c000000000000-0x001c000002000000        32M PMD RW X
...
Reviewed-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

6cad0eb5

s390/mm: optimize debugfs ptdump kasan zero page walking · e006222b

由 Vasily Gorbik 提交于 11月 17, 2017

Kasan zero p4d/pud/pmd/pte are always filled in with corresponding
kasan zero entries. Walking kasan zero page backed area is time
consuming and unnecessary. When kasan zero p4d/pud/pmd is encountered,
it eventually points to the kasan zero page always with the same
attributes and nothing but it, therefore zero p4d/pud/pmd could
be jumped over.

Also adds a space between address range and pages number to separate
them from each other when pages number is huge.

0x0018000000000000-0x0018000010000000       256M PMD RW X
0x0018000010000000-0x001bfffff0000000 1073741312M PTE RO X
0x001bfffff0000000-0x001bfffff0001000         4K PTE RW X
Reviewed-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

e006222b

s390/kasan: add option for 4-level paging support · 5dff0381

由 Vasily Gorbik 提交于 11月 19, 2017

By default 3-level paging is used when the kernel is compiled with
kasan support. Add 4-level paging option to support systems with more
then 3TB of physical memory and to cover 4-level paging specific code
with kasan as well.
Reviewed-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

5dff0381

s390/kasan: free early identity mapping structures · 135ff163

由 Vasily Gorbik 提交于 11月 20, 2017

Kasan initialization code is changed to populate persistent shadow
first, save allocator position into pgalloc_freeable and proceed with
early identity mapping creation. This way early identity mapping paging
structures could be freed at once after switching to swapper_pg_dir
when early identity mapping is not needed anymore.
Reviewed-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

135ff163

s390/kasan: use noexec and large pages · d58106c3

由 Vasily Gorbik 提交于 11月 17, 2017

To lower memory footprint and speed up kasan initialisation detect
EDAT availability and use large pages if possible. As we know how
much memory is needed for initialisation, another simplistic large
page allocator is introduced to avoid memory fragmentation.

Since facilities list is retrieved anyhow, detect noexec support and
adjust pages attributes. Handle noexec kernel option to avoid inconsistent
kasan shadow memory pages flags.
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

d58106c3

s390/kasan: dynamic shadow mem allocation for modules · 793213a8

由 Vasily Gorbik 提交于 11月 17, 2017

Move from modules area entire shadow memory preallocation to dynamic
allocation per module load.

This behaivior has been introduced for x86 with bebf56a1: "This patch
also forces module_alloc() to return 8*PAGE_SIZE aligned address making
shadow memory handling ( kasan_module_alloc()/kasan_module_free() )
more simple. Such alignment guarantees that each shadow page backing
modules address space correspond to only one module_alloc() allocation"
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

793213a8

s390/mm: add kasan shadow to the debugfs pgtable dump · 0dac8f6b

由 Vasily Gorbik 提交于 11月 17, 2017

This change adds address space markers for kasan shadow memory.
Reviewed-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

0dac8f6b

s390/kasan: add initialization code and enable it · 42db5ed8

由 Vasily Gorbik 提交于 11月 17, 2017

Kasan needs 1/8 of kernel virtual address space to be reserved as the
shadow area. And eventually it requires the shadow memory offset to be
known at compile time (passed to the compiler when full instrumentation
is enabled).  Any value picked as the shadow area offset for 3-level
paging would eat up identity mapping on 4-level paging (with 1PB
shadow area size). So, the kernel sticks to 3-level paging when kasan
is enabled. 3TB border is picked as the shadow offset.  The memory
layout is adjusted so, that physical memory border does not exceed
KASAN_SHADOW_START and vmemmap does not go below KASAN_SHADOW_END.

Due to the fact that on s390 paging is set up very late and to cover
more code with kasan instrumentation, temporary identity mapping and
final shadow memory are set up early. The shadow memory mapping is
later carried over to init_mm.pgd during paging_init.

For the needs of paging structures allocation and shadow memory
population a primitive allocator is used, which simply chops off
memory blocks from the end of the physical memory.

Kasan currenty doesn't track vmemmap and vmalloc areas.

Current memory layout (for 3-level paging, 2GB physical memory).

---[ Identity Mapping ]---
0x0000000000000000-0x0000000000100000
---[ Kernel Image Start ]---
0x0000000000100000-0x0000000002b00000
---[ Kernel Image End ]---
0x0000000002b00000-0x0000000080000000        2G <- physical memory border
0x0000000080000000-0x0000030000000000     3070G PUD I
---[ Kasan Shadow Start ]---
0x0000030000000000-0x0000030010000000      256M PMD RW X  <- shadow for 2G memory
0x0000030010000000-0x0000037ff0000000   523776M PTE RO NX <- kasan zero ro page
0x0000037ff0000000-0x0000038000000000      256M PMD RW X  <- shadow for 2G modules
---[ Kasan Shadow End ]---
0x0000038000000000-0x000003d100000000      324G PUD I
---[ vmemmap Area ]---
0x000003d100000000-0x000003e080000000
---[ vmalloc Area ]---
0x000003e080000000-0x000003ff80000000
---[ Modules Area ]---
0x000003ff80000000-0x0000040000000000        2G
Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

42db5ed8

s390/mem_detect: move tprot loop to early boot phase · 6966d604

由 Vasily Gorbik 提交于 4月 11, 2018

Move memory detection to early boot phase. To store online memory
regions "struct mem_detect_info" has been introduced together with
for_each_mem_detect_block iterator. mem_detect_info is later converted
to memblock.

Also introduces sclp_early_get_meminfo function to get maximum physical
memory and maximum increment number.
Reviewed-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

6966d604

s390: add support for virtually mapped kernel stacks · ce3dc447

由 Martin Schwidefsky 提交于 9月 12, 2017

With virtually mapped kernel stacks the kernel stack overflow detection
is now fault based, every stack has a guard page in the vmalloc space.
The panic_stack is renamed to nodat_stack and is used for all function
that need to run without DAT, e.g. memcpy_real or do_start_kdump.

The main effect is a reduction in the kernel image size as with vmap
stacks the old style overflow checking that adds two instructions per
function is not needed anymore. Result from bloat-o-meter:

add/remove: 20/1 grow/shrink: 13/26854 up/down: 2198/-216240 (-214042)

In regard to performance the micro-benchmark for fork has a hit of a
few microseconds, allocating 4 pages in vmalloc space is more expensive
compare to an order-2 page allocation. But with real workload I could
not find a noticeable difference.
Acked-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

ce3dc447

s390/pfault: do not use stack buffers for hardware data · 00e9e664

由 Martin Schwidefsky 提交于 9月 07, 2018

With CONFIG_VMAP_STACK=y the stack is allocated from the vmalloc space.
Data structures passed to a hardware or a hypervisor interface that
requires V=R can not be allocated on the stack anymore.

Make the init and fini pfault parameter blocks static variables.
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

00e9e664

01 10月, 2018 1 次提交

s390/mm: optimize locking without huge pages in gmap_pmd_op_walk() · af4bf6c3

由 David Hildenbrand 提交于 8月 06, 2018

Right now we temporarily take the page table lock in
gmap_pmd_op_walk() even though we know we won't need it (if we can
never have 1mb pages mapped into the gmap).

Let's make this a special case, so gmap_protect_range() and
gmap_sync_dirty_log_pmd() will not take the lock when huge pages are
not allowed.

gmap_protect_range() is called quite frequently for managing shadow
page tables in vSIE environments.
Signed-off-by: NDavid Hildenbrand <david@redhat.com>
Reviewed-by: NJanosch Frank <frankja@linux.ibm.com>
Message-Id: <20180806155407.15252-1-david@redhat.com>
Signed-off-by: NJanosch Frank <frankja@linux.ibm.com>

af4bf6c3

12 9月, 2018 1 次提交

s390/mm: Check for valid vma before zapping in gmap_discard · 1843abd0

由 Janosch Frank 提交于 8月 16, 2018

Userspace could have munmapped the area before doing unmapping from
the gmap. This would leave us with a valid vmaddr, but an invalid vma
from which we would try to zap memory.

Let's check before using the vma.

Fixes: 1e133ab2 ("s390/mm: split arch/s390/mm/pgtable.c")
Signed-off-by: NJanosch Frank <frankja@linux.ibm.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Message-Id: <20180816082432.78828-1-frankja@linux.ibm.com>
Signed-off-by: NJanosch Frank <frankja@linux.ibm.com>

1843abd0

18 8月, 2018 1 次提交

mm: convert return type of handle_mm_fault() caller to vm_fault_t · 50a7ca3c

由 Souptick Joarder 提交于 8月 17, 2018

Use new return type vm_fault_t for fault handler.  For now, this is just
documenting that the function returns a VM_FAULT value rather than an
errno.  Once all instances are converted, vm_fault_t will become a
distinct type.

Ref-> commit 1c8f4220 ("mm: change return type to vm_fault_t")

In this patch all the caller of handle_mm_fault() are changed to return
vm_fault_t type.

Link: http://lkml.kernel.org/r/20180617084810.GA6730@jordon-HP-15-Notebook-PCSigned-off-by: NSouptick Joarder <jrdr.linux@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: James Hogan <jhogan@kernel.org>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: David S. Miller <davem@davemloft.net>
Cc: Richard Weinberger <richard@nod.at>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Levin, Alexander (Sasha Levin)" <alexander.levin@verizon.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

50a7ca3c

09 8月, 2018 1 次提交

s390/mm: fix addressing exception after suspend/resume · 37a366fa

由 Gerald Schaefer 提交于 8月 07, 2018

Commit c9b5ad54 "s390/mm: tag normal pages vs pages used in page tables"
accidentally changed the logic in arch_set_page_states(), which is used by
the suspend/resume code. set_page_stable(page, order) was changed to
set_page_stable_dat(page, 0). After this, only the first page of higher order
pages will be set to stable, and a write to one of the unstable pages will
result in an addressing exception.

Fix this by using "order" again, instead of "0".

Fixes: c9b5ad54 ("s390/mm: tag normal pages vs pages used in page tables")
Cc: stable@vger.kernel.org # 4.14+
Reviewed-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

37a366fa

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功