- 29 4月, 2019 1 次提交
-
-
由 Gerald Schaefer 提交于
With a relocatable kernel that could reside at any place in memory, code and data that has to stay below 2 GB needs special handling. This patch introduces .dma sections for such text, data and ex_table. The sections will be part of the decompressor kernel, so they will not be relocated and stay below 2 GB. Their location is passed over to the decompressed / relocated kernel via the .boot.preserved.data section. The duald and aste for control register setup also need to stay below 2 GB, so move the setup code from arch/s390/kernel/head64.S to arch/s390/boot/head.S. The duct and linkage_stack could reside above 2 GB, but their content has to be preserved for the decompresed kernel, so they are also moved into the .dma section. The start and end address of the .dma sections is added to vmcoreinfo, for crash support, to help debugging in case the kernel crashed there. Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com> Reviewed-by: NPhilipp Rudo <prudo@linux.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 23 4月, 2019 2 次提交
-
-
由 Martin Schwidefsky 提交于
Define the gup_fast_permitted to check against the asce_limit of the mm attached to the current task, then replace the s390 specific gup code with the generic implementation in mm/gup.c. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Martin Schwidefsky 提交于
Change the way how pgd_offset, p4d_offset, pud_offset and pmd_offset walk the page tables. pgd_offset now always calculates the index for the top-level page table and adds it to the pgd, this is either a segment table offset for a 2-level setup, a region-3 offset for 3-levels, region-2 offset for 4-levels, or a region-1 offset for a 5-level setup. The other three functions p4d_offset, pud_offset and pmd_offset will only add the respective offset if they dereference the passed pointer. With the new way of walking the page tables a sequence like this from mm/gup.c now works: pgdp = pgd_offset(current->mm, addr); pgd = READ_ONCE(*pgdp); p4dp = p4d_offset(&pgd, addr); p4d = READ_ONCE(*p4dp); pudp = pud_offset(&p4d, addr); pud = READ_ONCE(*pudp); pmdp = pmd_offset(&pud, addr); pmd = READ_ONCE(*pmdp); Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 10 4月, 2019 1 次提交
-
-
由 Thomas Huth 提交于
If CONFIG_PGSTE is not set (e.g. when compiling without KVM), GCC complains: CC arch/s390/mm/pgtable.o arch/s390/mm/pgtable.c:413:15: warning: ‘pmd_alloc_map’ defined but not used [-Wunused-function] static pmd_t *pmd_alloc_map(struct mm_struct *mm, unsigned long addr) ^~~~~~~~~~~~~ Wrap the function with "#ifdef CONFIG_PGSTE" to silence the warning. Signed-off-by: NThomas Huth <thuth@redhat.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 06 3月, 2019 2 次提交
-
-
由 Aneesh Kumar K.V 提交于
Architectures like ppc64 require to do a conditional tlb flush based on the old and new value of pte. Enable that by passing old pte value as the arg. Link: http://lkml.kernel.org/r/20190116085035.29729-3-aneesh.kumar@linux.ibm.comSigned-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Aneesh Kumar K.V 提交于
Patch series "NestMMU pte upgrade workaround for mprotect", v5. We can upgrade pte access (R -> RW transition) via mprotect. We need to make sure we follow the recommended pte update sequence as outlined in commit bd5050e3 ("powerpc/mm/radix: Change pte relax sequence to handle nest MMU hang") for such updates. This patch series does that. This patch (of 5): Some architectures may want to call flush_tlb_range from these helpers. Link: http://lkml.kernel.org/r/20190116085035.29729-2-aneesh.kumar@linux.ibm.comSigned-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 01 3月, 2019 1 次提交
-
-
由 Vasily Gorbik 提交于
Facilities list in the lowcore is initially set up by verify_facilities from als.c and later initializations are redundant, so cleaning them up. Reviewed-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NVasily Gorbik <gor@linux.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 21 2月, 2019 2 次提交
-
-
由 Gerald Schaefer 提交于
The DCSS range is currently printed with %p, which results in hashed values instead of the actual addresses. Use %px instead, the DCSS ranges do not reveal any kernel symbol addresses. Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Gerald Schaefer 提交于
All supported releases of z/VM allow 64 bit subcodes and addressing mode for diag 0x64. This patch removes a lot of code for handling 31 bit addressing mode and old subcodes. Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 07 2月, 2019 1 次提交
-
-
由 Martin Schwidefsky 提交于
The s390 version of the mmap_base function is ignorant of stack_guard_gap which can lead to a placement of the stack vs. the mmap base that does not leave enough space for the stack rlimit. Add the stack_guard_gap to the calculation and while we are at it the check for gap+pad overflows as well. Reviewed-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 18 1月, 2019 1 次提交
-
-
由 Christoph Hellwig 提交于
These two functions are only used by core MM code, so no need to export them. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 29 12月, 2018 3 次提交
-
-
由 Oscar Salvador 提交于
Patch series "Do not touch pages in hot-remove path", v2. This patchset aims for two things: 1) A better definition about offline and hot-remove stage 2) Solving bugs where we can access non-initialized pages during hot-remove operations [2] [3]. This is achieved by moving all page/zone handling to the offline stage, so we do not need to access pages when hot-removing memory. [1] https://patchwork.kernel.org/cover/10691415/ [2] https://patchwork.kernel.org/patch/10547445/ [3] https://www.spinics.net/lists/linux-mm/msg161316.html This patch (of 5): This is a preparation for the following-up patches. The idea of passing the nid is that it will allow us to get rid of the zone parameter afterwards. Link: http://lkml.kernel.org/r/20181127162005.15833-2-osalvador@suse.deSigned-off-by: NOscar Salvador <osalvador@suse.de> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NPavel Tatashin <pasha.tatashin@soleen.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Arun KS 提交于
totalram_pages and totalhigh_pages are made static inline function. Main motivation was that managed_page_count_lock handling was complicating things. It was discussed in length here, https://lore.kernel.org/patchwork/patch/995739/#1181785 So it seemes better to remove the lock and convert variables to atomic, with preventing poteintial store-to-read tearing as a bonus. [akpm@linux-foundation.org: coding style fixes] Link: http://lkml.kernel.org/r/1542090790-21750-4-git-send-email-arunks@codeaurora.orgSigned-off-by: NArun KS <arunks@codeaurora.org> Suggested-by: NMichal Hocko <mhocko@suse.com> Suggested-by: NVlastimil Babka <vbabka@suse.cz> Reviewed-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru> Reviewed-by: NPavel Tatashin <pasha.tatashin@soleen.com> Acked-by: NMichal Hocko <mhocko@suse.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andrey Konovalov 提交于
With tag based KASAN mode the early shadow value is 0xff and not 0x00, so this patch renames kasan_zero_(page|pte|pmd|pud|p4d) to kasan_early_shadow_(page|pte|pmd|pud|p4d) to avoid confusion. Link: http://lkml.kernel.org/r/3fed313280ebf4f88645f5b89ccbc066d320e177.1544099024.git.andreyknvl@google.comSigned-off-by: NAndrey Konovalov <andreyknvl@google.com> Suggested-by: NMark Rutland <mark.rutland@arm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 30 11月, 2018 1 次提交
-
-
由 Sergey Senozhatsky 提交于
s390 is the only architecture that is using own bust_spinlocks() variant, while other arch-s seem to be OK with the common implementation. Heiko Carstens [1] said he would prefer s390 to use the common bust_spinlocks() as well: I did some code archaeology and this function is unchanged since ~17 years. When it was introduced it was close to identical to the x86 variant. All other architectures use the common code variant in the meantime. So if we change this I'd prefer that we switch s390 to the common code variant as well. Right now I can't see a reason for not doing that This patch removes s390 bust_spinlocks() and drops the weak attribute from the common bust_spinlocks() version. [1] lkml.kernel.org/r/20181025062800.GB4037@osiris Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 27 11月, 2018 1 次提交
-
-
由 Martin Schwidefsky 提交于
The downgrade of a page table from 3 levels to 2 levels for a 31-bit compat process removes a pmd table which has to be counted against pgtable_bytes. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 09 11月, 2018 1 次提交
-
-
由 Paul E. McKenney 提交于
Now that call_rcu()'s callback is not invoked until after all preempt-disable regions of code have completed (in addition to explicitly marked RCU read-side critical sections), call_rcu() can be used in place of call_rcu_sched(). This commit therefore makes that change. Signed-off-by: NPaul E. McKenney <paulmck@linux.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: <linux-s390@vger.kernel.org>
-
- 02 11月, 2018 1 次提交
-
-
由 Martin Schwidefsky 提交于
In case a fork or a clone system fails in copy_process and the error handling does the mmput() at the bad_fork_cleanup_mm label, the following warning messages will appear on the console: BUG: non-zero pgtables_bytes on freeing mm: 16384 The reason for that is the tricks we play with mm_inc_nr_puds() and mm_inc_nr_pmds() in init_new_context(). A normal 64-bit process has 3 levels of page table, the p4d level and the pud level are folded. On process termination the free_pud_range() function in mm/memory.c will subtract 16KB from pgtable_bytes with a mm_dec_nr_puds() call, but there actually is not really a pud table. One issue with this is the fact that pgtable_bytes is usually off by a few kilobytes, but the more severe problem is that for a failed fork or clone the free_pgtables() function is not called. In this case there is no mm_dec_nr_puds() or mm_dec_nr_pmds() that go together with the mm_inc_nr_puds() and mm_inc_nr_pmds in init_new_context(). The pgtable_bytes will be off by 16384 or 32768 bytes and we get the BUG message. The message itself is purely cosmetic, but annoying. To fix this override the mm_pmd_folded, mm_pud_folded and mm_p4d_folded function to check for the true size of the address space. Reported-by: NLi Wang <liwang@redhat.com> Tested-by: NLi Wang <liwang@redhat.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 31 10月, 2018 3 次提交
-
-
由 Mike Rapoport 提交于
Move remaining definitions and declarations from include/linux/bootmem.h into include/linux/memblock.h and remove the redundant header. The includes were replaced with the semantic patch below and then semi-automated removal of duplicated '#include <linux/memblock.h> @@ @@ - #include <linux/bootmem.h> + #include <linux/memblock.h> [sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h] Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au [sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h] Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au [sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal] Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Paul Burton <paul.burton@mips.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Serge Semin <fancer.lancer@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mike Rapoport 提交于
The conversion is done using sed -i 's@free_all_bootmem@memblock_free_all@' \ $(git grep -l free_all_bootmem) Link: http://lkml.kernel.org/r/1536927045-23536-26-git-send-email-rppt@linux.vnet.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.vnet.ibm.com> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Paul Burton <paul.burton@mips.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Serge Semin <fancer.lancer@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mike Rapoport 提交于
Make it explicit that the caller gets a physical address rather than a virtual one. This will also allow using meblock_alloc prefix for memblock allocations returning virtual address, which is done in the following patches. The conversion is done using the following semantic patch: @@ expression e1, e2, e3; @@ ( - memblock_alloc(e1, e2) + memblock_phys_alloc(e1, e2) | - memblock_alloc_nid(e1, e2, e3) + memblock_phys_alloc_nid(e1, e2, e3) | - memblock_alloc_try_nid(e1, e2, e3) + memblock_phys_alloc_try_nid(e1, e2, e3) ) Link: http://lkml.kernel.org/r/1536927045-23536-7-git-send-email-rppt@linux.vnet.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.vnet.ibm.com> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Paul Burton <paul.burton@mips.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Serge Semin <fancer.lancer@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 10月, 2018 1 次提交
-
-
由 Vasily Gorbik 提交于
When the kernel is built with: CONFIG_PREEMPT=y CONFIG_PREEMPT_COUNT=y "stfle" function used by kasan initialization code makes additional call to preempt_count_add/preempt_count_sub. To avoid removing kasan instrumentation from sched code where those functions leave split stfle function and provide __stfle variant without preemption handling to be used by Kasan. Reported-by: NBenjamin Block <bblock@linux.ibm.com> Acked-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NVasily Gorbik <gor@linux.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 09 10月, 2018 14 次提交
-
-
由 Vasily Gorbik 提交于
Handle mem= kernel parameter in kasan to limit physical memory. Reviewed-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NVasily Gorbik <gor@linux.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Vasily Gorbik 提交于
Kasan implementation now supports memory hotplug operations. For that reason regions of initially standby memory are now skipped from shadow mapping and are mapped/unmapped dynamically upon bringing memory online/offline. Signed-off-by: NVasily Gorbik <gor@linux.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Vasily Gorbik 提交于
Kasan early memory allocator simply chops off memory blocks from the end of the physical memory. Reuse mem_detect info to identify actual online memory end rather than using max_physmem_end. This allows to run the kernel with kasan enabled and standby memory defined. Signed-off-by: NVasily Gorbik <gor@linux.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Vasily Gorbik 提交于
This allows to print multiple markers when they happened to have the same value. ... 0x001bfffff0100000-0x001c000000000000 255M PMD I ---[ Kasan Shadow End ]--- ---[ vmemmap Area ]--- 0x001c000000000000-0x001c000002000000 32M PMD RW X ... Reviewed-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NVasily Gorbik <gor@linux.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Vasily Gorbik 提交于
Kasan zero p4d/pud/pmd/pte are always filled in with corresponding kasan zero entries. Walking kasan zero page backed area is time consuming and unnecessary. When kasan zero p4d/pud/pmd is encountered, it eventually points to the kasan zero page always with the same attributes and nothing but it, therefore zero p4d/pud/pmd could be jumped over. Also adds a space between address range and pages number to separate them from each other when pages number is huge. 0x0018000000000000-0x0018000010000000 256M PMD RW X 0x0018000010000000-0x001bfffff0000000 1073741312M PTE RO X 0x001bfffff0000000-0x001bfffff0001000 4K PTE RW X Reviewed-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NVasily Gorbik <gor@linux.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Vasily Gorbik 提交于
By default 3-level paging is used when the kernel is compiled with kasan support. Add 4-level paging option to support systems with more then 3TB of physical memory and to cover 4-level paging specific code with kasan as well. Reviewed-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NVasily Gorbik <gor@linux.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Vasily Gorbik 提交于
Kasan initialization code is changed to populate persistent shadow first, save allocator position into pgalloc_freeable and proceed with early identity mapping creation. This way early identity mapping paging structures could be freed at once after switching to swapper_pg_dir when early identity mapping is not needed anymore. Reviewed-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NVasily Gorbik <gor@linux.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Vasily Gorbik 提交于
To lower memory footprint and speed up kasan initialisation detect EDAT availability and use large pages if possible. As we know how much memory is needed for initialisation, another simplistic large page allocator is introduced to avoid memory fragmentation. Since facilities list is retrieved anyhow, detect noexec support and adjust pages attributes. Handle noexec kernel option to avoid inconsistent kasan shadow memory pages flags. Signed-off-by: NVasily Gorbik <gor@linux.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Vasily Gorbik 提交于
Move from modules area entire shadow memory preallocation to dynamic allocation per module load. This behaivior has been introduced for x86 with bebf56a1: "This patch also forces module_alloc() to return 8*PAGE_SIZE aligned address making shadow memory handling ( kasan_module_alloc()/kasan_module_free() ) more simple. Such alignment guarantees that each shadow page backing modules address space correspond to only one module_alloc() allocation" Signed-off-by: NVasily Gorbik <gor@linux.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Vasily Gorbik 提交于
This change adds address space markers for kasan shadow memory. Reviewed-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NVasily Gorbik <gor@linux.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Vasily Gorbik 提交于
Kasan needs 1/8 of kernel virtual address space to be reserved as the shadow area. And eventually it requires the shadow memory offset to be known at compile time (passed to the compiler when full instrumentation is enabled). Any value picked as the shadow area offset for 3-level paging would eat up identity mapping on 4-level paging (with 1PB shadow area size). So, the kernel sticks to 3-level paging when kasan is enabled. 3TB border is picked as the shadow offset. The memory layout is adjusted so, that physical memory border does not exceed KASAN_SHADOW_START and vmemmap does not go below KASAN_SHADOW_END. Due to the fact that on s390 paging is set up very late and to cover more code with kasan instrumentation, temporary identity mapping and final shadow memory are set up early. The shadow memory mapping is later carried over to init_mm.pgd during paging_init. For the needs of paging structures allocation and shadow memory population a primitive allocator is used, which simply chops off memory blocks from the end of the physical memory. Kasan currenty doesn't track vmemmap and vmalloc areas. Current memory layout (for 3-level paging, 2GB physical memory). ---[ Identity Mapping ]--- 0x0000000000000000-0x0000000000100000 ---[ Kernel Image Start ]--- 0x0000000000100000-0x0000000002b00000 ---[ Kernel Image End ]--- 0x0000000002b00000-0x0000000080000000 2G <- physical memory border 0x0000000080000000-0x0000030000000000 3070G PUD I ---[ Kasan Shadow Start ]--- 0x0000030000000000-0x0000030010000000 256M PMD RW X <- shadow for 2G memory 0x0000030010000000-0x0000037ff0000000 523776M PTE RO NX <- kasan zero ro page 0x0000037ff0000000-0x0000038000000000 256M PMD RW X <- shadow for 2G modules ---[ Kasan Shadow End ]--- 0x0000038000000000-0x000003d100000000 324G PUD I ---[ vmemmap Area ]--- 0x000003d100000000-0x000003e080000000 ---[ vmalloc Area ]--- 0x000003e080000000-0x000003ff80000000 ---[ Modules Area ]--- 0x000003ff80000000-0x0000040000000000 2G Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NVasily Gorbik <gor@linux.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Vasily Gorbik 提交于
Move memory detection to early boot phase. To store online memory regions "struct mem_detect_info" has been introduced together with for_each_mem_detect_block iterator. mem_detect_info is later converted to memblock. Also introduces sclp_early_get_meminfo function to get maximum physical memory and maximum increment number. Reviewed-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NVasily Gorbik <gor@linux.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Martin Schwidefsky 提交于
With virtually mapped kernel stacks the kernel stack overflow detection is now fault based, every stack has a guard page in the vmalloc space. The panic_stack is renamed to nodat_stack and is used for all function that need to run without DAT, e.g. memcpy_real or do_start_kdump. The main effect is a reduction in the kernel image size as with vmap stacks the old style overflow checking that adds two instructions per function is not needed anymore. Result from bloat-o-meter: add/remove: 20/1 grow/shrink: 13/26854 up/down: 2198/-216240 (-214042) In regard to performance the micro-benchmark for fork has a hit of a few microseconds, allocating 4 pages in vmalloc space is more expensive compare to an order-2 page allocation. But with real workload I could not find a noticeable difference. Acked-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Martin Schwidefsky 提交于
With CONFIG_VMAP_STACK=y the stack is allocated from the vmalloc space. Data structures passed to a hardware or a hypervisor interface that requires V=R can not be allocated on the stack anymore. Make the init and fini pfault parameter blocks static variables. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 01 10月, 2018 1 次提交
-
-
由 David Hildenbrand 提交于
Right now we temporarily take the page table lock in gmap_pmd_op_walk() even though we know we won't need it (if we can never have 1mb pages mapped into the gmap). Let's make this a special case, so gmap_protect_range() and gmap_sync_dirty_log_pmd() will not take the lock when huge pages are not allowed. gmap_protect_range() is called quite frequently for managing shadow page tables in vSIE environments. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NJanosch Frank <frankja@linux.ibm.com> Message-Id: <20180806155407.15252-1-david@redhat.com> Signed-off-by: NJanosch Frank <frankja@linux.ibm.com>
-
- 12 9月, 2018 1 次提交
-
-
由 Janosch Frank 提交于
Userspace could have munmapped the area before doing unmapping from the gmap. This would leave us with a valid vmaddr, but an invalid vma from which we would try to zap memory. Let's check before using the vma. Fixes: 1e133ab2 ("s390/mm: split arch/s390/mm/pgtable.c") Signed-off-by: NJanosch Frank <frankja@linux.ibm.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Message-Id: <20180816082432.78828-1-frankja@linux.ibm.com> Signed-off-by: NJanosch Frank <frankja@linux.ibm.com>
-
- 18 8月, 2018 1 次提交
-
-
由 Souptick Joarder 提交于
Use new return type vm_fault_t for fault handler. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. Ref-> commit 1c8f4220 ("mm: change return type to vm_fault_t") In this patch all the caller of handle_mm_fault() are changed to return vm_fault_t type. Link: http://lkml.kernel.org/r/20180617084810.GA6730@jordon-HP-15-Notebook-PCSigned-off-by: NSouptick Joarder <jrdr.linux@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Tony Luck <tony.luck@intel.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Simek <monstr@monstr.eu> Cc: James Hogan <jhogan@kernel.org> Cc: Ley Foon Tan <lftan@altera.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: David S. Miller <davem@davemloft.net> Cc: Richard Weinberger <richard@nod.at> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Levin, Alexander (Sasha Levin)" <alexander.levin@verizon.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 8月, 2018 1 次提交
-
-
由 Gerald Schaefer 提交于
Commit c9b5ad54 "s390/mm: tag normal pages vs pages used in page tables" accidentally changed the logic in arch_set_page_states(), which is used by the suspend/resume code. set_page_stable(page, order) was changed to set_page_stable_dat(page, 0). After this, only the first page of higher order pages will be set to stable, and a write to one of the unstable pages will result in an addressing exception. Fix this by using "order" again, instead of "0". Fixes: c9b5ad54 ("s390/mm: tag normal pages vs pages used in page tables") Cc: stable@vger.kernel.org # 4.14+ Reviewed-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-