1. 30 Jan 2020, 1 commit
    • s390/mm: fix dynamic pagetable upgrade for hugetlbfs · 5f490a52
      Authored by Gerald Schaefer
      Commit ee71d16d ("s390/mm: make TASK_SIZE independent from the number
      of page table levels") changed the logic of TASK_SIZE and also removed the
      arch_mmap_check() implementation for s390. This combination has a subtle
      effect on how get_unmapped_area() for hugetlbfs pages works. It is now
      possible that a user process establishes a hugetlbfs mapping at an address
      above 4 TB, without triggering a dynamic pagetable upgrade from 3 to 4
      levels.
      
      This is because hugetlbfs mappings will not use mm->get_unmapped_area, but
      rather file->f_op->get_unmapped_area, which currently is the generic
      implementation of hugetlb_get_unmapped_area() that does not know about s390
      dynamic pagetable upgrades, but with the new definition of TASK_SIZE, it
      will now allow mappings above 4 TB.
      
      Subsequent access to such a mapped address above 4 TB will result in a page
      fault loop, because the CPU cannot translate such a large address with 3
      pagetable levels. The fault handler will try to map in a hugepage at the
      address, but due to the folded pagetable logic it will end up creating
      entries in the 3 level pagetable, possibly overwriting existing mappings,
      and then it all repeats when the access is retried.
      
      Apart from the page fault loop, this can have various nasty effects, e.g.
      kernel panic from one of the BUG_ON() checks in memory management code,
      or even data loss if an existing mapping gets overwritten.
      
      Fix this by implementing HAVE_ARCH_HUGETLB_UNMAPPED_AREA support for s390,
      providing an s390 version for hugetlb_get_unmapped_area() with pagetable
      upgrade support similar to arch_get_unmapped_area(), which will then be
      used instead of the generic version.
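
      A minimal sketch of the shape such an override can take (illustrative
      only; the real code lives in arch/s390/mm/hugetlbpage.c and also
      handles top-down layouts and MAP_FIXED, and crst_table_upgrade() is
      the existing s390 helper that grows the page table):

      	unsigned long hugetlb_get_unmapped_area(struct file *file,
      			unsigned long addr, unsigned long len,
      			unsigned long pgoff, unsigned long flags)
      	{
      		struct mm_struct *mm = current->mm;
      		int rc;

      		/* ... generic placement as in arch_get_unmapped_area() ... */

      		if (addr + len > mm->context.asce_limit &&
      		    addr + len <= TASK_SIZE) {
      			/* upgrade from 3 to 4 (or 4 to 5) pagetable levels */
      			rc = crst_table_upgrade(mm, addr + len);
      			if (rc)
      				return (unsigned long) rc;
      		}
      		return addr;
      	}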
      
      Fixes: ee71d16d ("s390/mm: make TASK_SIZE independent from the number of page table levels")
      Cc: <stable@vger.kernel.org> # 4.12+
      Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
  2. 05 Jan 2020, 1 commit
    • mm/memory_hotplug: shrink zones when offlining memory · feee6b29
      Authored by David Hildenbrand
      We currently try to shrink a single zone when removing memory.  We use
      the zone of the first page of the memory we are removing.  If that
      memmap was never initialized (e.g., memory was never onlined), we will
      read garbage and can trigger kernel BUGs (due to a stale pointer):
      
          BUG: unable to handle page fault for address: 000000000000353d
          #PF: supervisor write access in kernel mode
          #PF: error_code(0x0002) - not-present page
          PGD 0 P4D 0
          Oops: 0002 [#1] SMP PTI
          CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
          Workqueue: kacpi_hotplug acpi_hotplug_work_fn
          RIP: 0010:clear_zone_contiguous+0x5/0x10
          Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840
          RSP: 0018:ffffad2400043c98 EFLAGS: 00010246
          RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000
          RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40
          RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001
          R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
          R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680
          FS:  0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0
          DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
          Call Trace:
           __remove_pages+0x4b/0x640
           arch_remove_memory+0x63/0x8d
           try_remove_memory+0xdb/0x130
           __remove_memory+0xa/0x11
           acpi_memory_device_remove+0x70/0x100
           acpi_bus_trim+0x55/0x90
           acpi_device_hotplug+0x227/0x3a0
           acpi_hotplug_work_fn+0x1a/0x30
           process_one_work+0x221/0x550
           worker_thread+0x50/0x3b0
           kthread+0x105/0x140
           ret_from_fork+0x3a/0x50
          Modules linked in:
          CR2: 000000000000353d
      
      Instead, shrink the zones when offlining memory or when onlining failed.
      Introduce and use remove_pfn_range_from_zone() for that.  We now
      properly shrink the zones, even if we have DIMMs where
      
       - Some memory blocks fall into no zone (never onlined)
      
       - Some memory blocks fall into multiple zones (offlined+re-onlined)
      
       - Multiple memory blocks fall into different zones
      
      Drop the zone parameter (with a potential dubious value) from
      __remove_pages() and __remove_section().
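
      The new helper has roughly this shape (signature as described above;
      the surrounding offlining code is elided and the caller fragment is
      illustrative):

      	/* shrink zone/node spans and clear the contiguous flag */
      	void remove_pfn_range_from_zone(struct zone *zone,
      					unsigned long start_pfn,
      					unsigned long nr_pages);

      	/* e.g. on the offlining / failed-onlining path, the zone is
      	 * taken from pages that really were onlined into it: */
      	zone = page_zone(pfn_to_page(start_pfn));
      	remove_pfn_range_from_zone(zone, start_pfn, nr_pages);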
      
      Link: http://lkml.kernel.org/r/20191006085646.5768-6-david@redhat.com
      Fixes: f1dd2cd1 ("mm, memory_hotplug: do not associate hotadded memory to zones until online")	[visible after d0dc12e8]
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: <stable@vger.kernel.org>	[5.0+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 12 Dec 2019, 1 commit
  4. 30 Nov 2019, 1 commit
  5. 20 Nov 2019, 1 commit
    • s390/kasan: support memcpy_real with TRACE_IRQFLAGS · 13f9bae5
      Authored by Vasily Gorbik
      Currently if the kernel is built with CONFIG_TRACE_IRQFLAGS and KASAN
      and used as crash kernel it crashes itself due to
      trace_hardirqs_off/trace_hardirqs_on being called with DAT off. This
      happens because trace_hardirqs_off/trace_hardirqs_on are instrumented and
      kasan code tries to perform access to shadow memory to validate memory
      accesses. Kasan shadow memory is populated with vmemmap, so all accesses
      require DAT on.
      
      memcpy_real could be called with DAT on or off (with kasan enabled DAT
      is set even before early code is executed).
      
      Make sure that trace_hardirqs_off/trace_hardirqs_on are called with DAT
      on and only actual __memcpy_real is called with DAT off.
      
      Also annotate __memcpy_real and _memcpy_real with __no_sanitize_address
      to avoid further problems due to switching DAT off.
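
      A sketch of the resulting split (names follow arch/s390/mm/maccess.c,
      details illustrative; __memcpy_real() is the DAT-off inner copy):

      	static notrace __no_sanitize_address
      	long _memcpy_real(unsigned long dest, unsigned long src,
      			  unsigned long count)
      	{
      		int irqs_disabled;
      		unsigned long flags;
      		long rc;

      		if (!count)
      			return 0;
      		flags = arch_local_irq_save();	/* raw, not instrumented */
      		irqs_disabled = arch_irqs_disabled_flags(flags);
      		if (!irqs_disabled)
      			trace_hardirqs_off();	/* DAT is still on here */
      		rc = __memcpy_real((void *) dest, (void *) src, (size_t) count);
      		if (!irqs_disabled)
      			trace_hardirqs_on();	/* DAT is on again */
      		arch_local_irq_restore(flags);
      		return rc;
      	}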
      Reviewed-by: Philipp Rudo <prudo@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
  6. 01 Nov 2019, 2 commits
  7. 27 Sep 2019, 1 commit
    • mm: treewide: clarify pgtable_page_{ctor,dtor}() naming · b4ed71f5
      Authored by Mark Rutland
      The naming of pgtable_page_{ctor,dtor}() seems to have confused a few
      people, and until recently arm64 used these erroneously/pointlessly for
      other levels of page table.
      
      To make it incredibly clear that these only apply to the PTE level, and to
      align with the naming of pgtable_pmd_page_{ctor,dtor}(), let's rename them
      to pgtable_pte_page_{ctor,dtor}().
      
      These changes were generated with the following shell script:
      
      ----
      git grep -lw 'pgtable_page_.tor' | while read FILE; do
          sed -i '{s/pgtable_page_ctor/pgtable_pte_page_ctor/}' $FILE;
          sed -i '{s/pgtable_page_dtor/pgtable_pte_page_dtor/}' $FILE;
      done
      ----
      
      ... with the documentation re-flowed to remain under 80 columns, and
      whitespace fixed up in macros to keep backslashes aligned.
      
      There should be no functional change as a result of this patch.
      
      Link: http://lkml.kernel.org/r/20190722141133.3116-1-mark.rutland@arm.com
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Yu Zhao <yuzhao@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  8. 07 Sep 2019, 2 commits
  9. 29 Aug 2019, 1 commit
    • s390/kasan: add kdump support · 042c1d29
      Authored by Vasily Gorbik
      If a kasan-enabled kernel is used as crash kernel, it crashes itself
      with a program check loop during kdump execution. The reason is that
      the kasan shadow memory is backed by pages beyond OLDMEM_SIZE. Make
      the kasan memory allocator respect the physical memory limit imposed
      by kdump.
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
  10. 26 Aug 2019, 2 commits
  11. 21 Aug 2019, 2 commits
  12. 09 Aug 2019, 1 commit
  13. 06 Aug 2019, 1 commit
  14. 30 Jul 2019, 2 commits
  15. 26 Jul 2019, 1 commit
  16. 19 Jul 2019, 3 commits
    • mm/memory_hotplug: allow arch_remove_memory() without CONFIG_MEMORY_HOTREMOVE · 80ec922d
      Authored by David Hildenbrand
      We want to improve error handling while adding memory by allowing the
      use of arch_remove_memory() and __remove_pages() even if
      CONFIG_MEMORY_HOTREMOVE is not set, to e.g. implement something like:
      
      	arch_add_memory()
      	rc = do_something();
      	if (rc) {
      		arch_remove_memory();
      	}
      
      We won't get rid of CONFIG_MEMORY_HOTREMOVE for now, as it will require
      quite some dependencies for memory offlining.
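
      Fleshed out, such an error path could look like this (hypothetical
      caller: do_something() is a placeholder and the argument lists of
      arch_add_memory()/arch_remove_memory() are abbreviated):

      	int example_add_memory(int nid, u64 start, u64 size)
      	{
      		int rc;

      		rc = arch_add_memory(nid, start, size, /* ... */);
      		if (rc)
      			return rc;

      		rc = do_something();	/* e.g. create user-space interfaces */
      		if (rc)			/* now possible without CONFIG_MEMORY_HOTREMOVE */
      			arch_remove_memory(nid, start, size, /* ... */);
      		return rc;
      	}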
      
      Link: http://lkml.kernel.org/r/20190527111152.16324-7-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Oscar Salvador <osalvador@suse.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Mark Brown <broonie@kernel.org>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: "mike.travis@hpe.com" <mike.travis@hpe.com>
      Cc: Andrew Banman <andrew.banman@hpe.com>
      Cc: Arun KS <arunks@codeaurora.org>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Mathieu Malaterre <malat@debian.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chintan Pandya <cpandya@codeaurora.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Jun Yao <yaojun8558363@gmail.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yu Zhao <yuzhao@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • s390x/mm: implement arch_remove_memory() · 18c86506
      Authored by David Hildenbrand
      Will come in handy when wanting to handle errors after
      arch_add_memory().
      
      Link: http://lkml.kernel.org/r/20190527111152.16324-4-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Oscar Salvador <osalvador@suse.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: Andrew Banman <andrew.banman@hpe.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arun KS <arunks@codeaurora.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chintan Pandya <cpandya@codeaurora.org>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Jun Yao <yaojun8558363@gmail.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: Mark Brown <broonie@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Mathieu Malaterre <malat@debian.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: "mike.travis@hpe.com" <mike.travis@hpe.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qian Cai <cai@lca.pw>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Yu Zhao <yuzhao@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • s390x/mm: fail when an altmap is used for arch_add_memory() · 973de24a
      Authored by David Hildenbrand
      ZONE_DEVICE is not yet supported; fail if an altmap is passed, so we
      don't forget to update arch_add_memory()/arch_remove_memory() when
      unlocking support.
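
      The guard itself is a one-liner in the s390 arch_add_memory() (sketch;
      the restrictions argument is the hotplug context introduced earlier in
      this series):

      	int arch_add_memory(int nid, u64 start, u64 size,
      			    struct mhp_restrictions *restrictions)
      	{
      		/* no ZONE_DEVICE support yet: refuse altmap-backed memmaps */
      		if (WARN_ON_ONCE(restrictions->altmap))
      			return -EINVAL;
      		/* ... vmem_add_mapping() + __add_pages() as before ... */
      	}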
      
      Link: http://lkml.kernel.org/r/20190527111152.16324-3-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Suggested-by: Dan Williams <dan.j.williams@intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Oscar Salvador <osalvador@suse.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: Andrew Banman <andrew.banman@hpe.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arun KS <arunks@codeaurora.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chintan Pandya <cpandya@codeaurora.org>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Jun Yao <yaojun8558363@gmail.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: Mark Brown <broonie@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Mathieu Malaterre <malat@debian.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: "mike.travis@hpe.com" <mike.travis@hpe.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qian Cai <cai@lca.pw>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Yu Zhao <yuzhao@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  17. 17 Jul 2019, 2 commits
    • mm, kprobes: generalize and rename notify_page_fault() as kprobe_page_fault() · b98cca44
      Authored by Anshuman Khandual
      Architectures which support kprobes have very similar boilerplate around
      calling kprobe_fault_handler().  Use a helper function in kprobes.h to
      unify them, based on the x86 code.
      
      This changes the behaviour for other architectures when preemption is
      enabled.  Previously, they would have disabled preemption while calling
      the kprobe handler.  However, preemption would be disabled if this fault
      was due to a kprobe, so we know the fault was not due to a kprobe
      handler and can simply return failure.
      
      This behaviour was introduced in commit a980c0ef ("x86/kprobes:
      Refactor kprobes_fault() like kprobe_exceptions_notify()")
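
      The unified helper is essentially of this shape (sketch modeled on the
      x86 code; see include/linux/kprobes.h for the real version):

      	static nokprobe_inline bool kprobe_page_fault(struct pt_regs *regs,
      						      unsigned int trap)
      	{
      		/*
      		 * Calling kprobe_running() requires being non-preemptible
      		 * and in kernel mode; otherwise the fault cannot belong to
      		 * a kprobe and we simply report "not handled".
      		 */
      		if (!kprobes_built_in() || preemptible() || user_mode(regs))
      			return false;
      		return kprobe_running() && kprobe_fault_handler(regs, trap);
      	}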
      
      [anshuman.khandual@arm.com: export kprobe_fault_handler()]
        Link: http://lkml.kernel.org/r/1561133358-8876-1-git-send-email-anshuman.khandual@arm.com
      Link: http://lkml.kernel.org/r/1560420444-25737-1-git-send-email-anshuman.khandual@arm.com
      Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • dma-direct: Force unencrypted DMA under SME for certain DMA masks · 9087c375
      Authored by Tom Lendacky
      If a device doesn't support DMA to a physical address that includes the
      encryption bit (currently bit 47, so 48-bit DMA), then the DMA must
      occur to unencrypted memory. SWIOTLB is used to satisfy that requirement
      if an IOMMU is not active (enabled or configured in passthrough mode).
      
      However, commit fafadcd1 ("swiotlb: don't dip into swiotlb pool for
      coherent allocations") modified the coherent allocation support in
      SWIOTLB to use the DMA direct coherent allocation support. When an IOMMU
      is not active, this resulted in dma_alloc_coherent() failing for devices
      that didn't support DMA addresses that included the encryption bit.
      
      Addressing this requires changes to the force_dma_unencrypted() function
      in kernel/dma/direct.c. Since the function is now non-trivial and
      SME/SEV specific, update the DMA direct support to add an arch override
      for the force_dma_unencrypted() function. The arch override is selected
      when CONFIG_AMD_MEM_ENCRYPT is set. The arch override function resides in
      the arch/x86/mm/mem_encrypt.c file and forces unencrypted DMA when either
      SEV is active or SME is active and the device does not support DMA to
      physical addresses that include the encryption bit.
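
      The override then reads roughly as follows (per the description above;
      the mask handling is the SME/SEV-specific part):

      	bool force_dma_unencrypted(struct device *dev)
      	{
      		/* for SEV, all DMA must be to unencrypted addresses */
      		if (sev_active())
      			return true;

      		/*
      		 * for SME, DMA must be unencrypted if the device's mask
      		 * cannot reach an address that includes the encryption bit
      		 */
      		if (sme_active()) {
      			u64 dma_enc_mask = DMA_BIT_MASK(__ffs64(sme_me_mask));
      			u64 dma_dev_mask = min_not_zero(dev->coherent_dma_mask,
      							dev->bus_dma_mask);

      			if (dma_dev_mask <= dma_enc_mask)
      				return true;
      		}
      		return false;
      	}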
      
      Fixes: fafadcd1 ("swiotlb: don't dip into swiotlb pool for coherent allocations")
      Suggested-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      [hch: moved the force_dma_unencrypted declaration to dma-mapping.h,
            fold the s390 fix from Halil Pasic]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  18. 15 Jun 2019, 1 commit
  19. 05 Jun 2019, 1 commit
  20. 04 Jun 2019, 2 commits
  21. 29 May 2019, 1 commit
    • signal: Remove the task parameter from force_sig_fault · 2e1661d2
      Authored by Eric W. Biederman
      As synchronous exceptions really only make sense against the current
      task (otherwise how are you synchronous), remove the task parameter
      from force_sig_fault to make it explicit that this is what is going
      on.
      
      The two known exceptions that deliver a synchronous exception to a
      stopped ptraced task have already been changed to
      force_sig_fault_to_task.
      
      The callers have been changed with the following emacs regular expression
      (with obvious variations on the architectures that take more arguments)
      to avoid typos:
      
      force_sig_fault[(]\([^,]+\)[,]\([^,]+\)[,]\([^,]+\)[,]\W+current[)]
      ->
      force_sig_fault(\1,\2,\3)
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
  22. 28 May 2019, 1 commit
    • s390: add unreachable() to dump_fault_info() to fix -Wmaybe-uninitialized · bf2f1eee
      Authored by Masahiro Yamada
      When CONFIG_OPTIMIZE_INLINING is enabled for s390, I see this warning:
      
      arch/s390/mm/fault.c:127:15: warning: 'asce' may be used uninitialized in this function [-Wmaybe-uninitialized]
        switch (asce & _ASCE_TYPE_MASK) {
      arch/s390/mm/fault.c:177:16: note: 'asce' was declared here
        unsigned long asce;
                      ^~~~
      
      If get_fault_type() is not inlined, the compiler cannot deduce that
      all the possible paths in the 'switch' statement are covered.
      
      Of course, we could mark get_fault_type() as __always_inline to get
      back the original behavior, but I do not think it sensible to force
      inlining just for the purpose of suppressing the warning. Since this
      is just a matter of warning, I want to keep as much room for compiler
      optimization as possible.
      
      I added unreachable() to teach the compiler that the 'default' label
      is unreachable.
      
      I got rid of the 'inline' marker. Even without the 'inline' hint,
      the compiler inlines functions based on its inlining heuristic.
      
      Fixes: 9012d011 ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
  23. 15 May 2019, 3 commits
    • mm/memory_hotplug: make __remove_pages() and arch_remove_memory() never fail · ac5c9426
      Authored by David Hildenbrand
      All callers of arch_remove_memory() ignore errors.  And we should really
      try to remove any errors from the memory removal path.  No more errors are
      reported from __remove_pages().  BUG() in s390x code in case
      arch_remove_memory() is triggered.  We may implement that properly later.
      WARN in case powerpc code failed to remove the section mapping, which is
      better than ignoring the error completely right now.
      
      Link: http://lkml.kernel.org/r/20190409100148.24703-5-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Oscar Salvador <osalvador@suse.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Stefan Agner <stefan@agner.ch>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Arun KS <arunks@codeaurora.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Mathieu Malaterre <malat@debian.org>
      Cc: Andrew Banman <andrew.banman@hpe.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Mike Travis <mike.travis@hpe.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, memory_hotplug: provide a more generic restrictions for memory hotplug · 940519f0
      Authored by Michal Hocko
      arch_add_memory, __add_pages take a want_memblock which controls whether
      the newly added memory should get the sysfs memblock user API (e.g.
      ZONE_DEVICE users do not want/need this interface).  Some callers even
      want to control where do we allocate the memmap from by configuring
      altmap.
      
      Add a more generic hotplug context for arch_add_memory and __add_pages.
      struct mhp_restrictions contains flags, which carry additional features
      to be enabled by the memory hotplug (currently MHP_MEMBLOCK_API), and an
      altmap for an alternative memmap allocator.
      
      This patch shouldn't introduce any functional change.
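
      The context structure as described (include/linux/memory_hotplug.h;
      comments paraphrase the commit message):

      	/*
      	 * Restrictions for the memory hotplug:
      	 * flags:  MHP_ flags, e.g. MHP_MEMBLOCK_API
      	 * altmap: alternative allocator for the memmap array
      	 */
      	struct mhp_restrictions {
      		unsigned long flags;
      		struct vmem_altmap *altmap;
      	};

      	int arch_add_memory(int nid, u64 start, u64 size,
      			    struct mhp_restrictions *restrictions);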
      
      [akpm@linux-foundation.org: build fix]
      Link: http://lkml.kernel.org/r/20190408082633.2864-3-osalvador@suse.de
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Oscar Salvador <osalvador@suse.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: David Hildenbrand <david@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • initramfs: poison freed initrd memory · f94f7434
      Authored by Christoph Hellwig
      Various architectures including x86 poison the freed initrd memory.  Do
      the same in the generic free_initrd_mem implementation and switch a few
      more architectures that are identical to the generic code over to it now.
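
      The generic implementation then boils down to (sketch;
      free_reserved_area() already accepts a poison value):

      	void __weak free_initrd_mem(unsigned long start, unsigned long end)
      	{
      		/* poison so stale references to the initrd fault loudly */
      		free_reserved_area((void *)start, (void *)end,
      				   POISON_FREE_INITMEM, "initrd");
      	}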
      
      Link: http://lkml.kernel.org/r/20190213174621.29297-9-hch@lst.de
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>	[arm64]
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
      Cc: Steven Price <steven.price@arm.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  24. 08 May 2019, 1 commit
  25. 02 May 2019, 1 commit
    • s390/unwind: introduce stack unwind API · 78c98f90
      Authored by Martin Schwidefsky
      Rework the dump_trace() stack unwinder interface to support different
      unwinding algorithms. The new interface looks like this:
      
      	struct unwind_state state;
      	unwind_for_each_frame(&state, task, regs, start_stack)
      		do_something(state.sp, state.ip, state.reliable);
      
      The unwind_bc.c file contains the implementation for the classic
      back-chain unwinder.
      
      One positive side effect of the new code is it now handles ftraced
      functions gracefully. It prints the real name of the return function
      instead of 'return_to_handler'.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  26. 29 Apr 2019, 2 commits
    • locking/lockdep: check for freed initmem in static_obj() · 7a5da02d
      Authored by Gerald Schaefer
      The following warning occurred on s390:
      WARNING: CPU: 0 PID: 804 at kernel/locking/lockdep.c:1025 lockdep_register_key+0x30/0x150
      
      This is because the check in static_obj() assumes that all memory within
      [_stext, _end] belongs to static objects, which at least for s390 isn't
      true. The init section is also part of this range, and freeing it allows
      the buddy allocator to allocate memory from it. We have virt == phys for
      the kernel on s390, so that such allocations would then have addresses
      within the range [_stext, _end].
      
      To fix this, introduce arch_is_kernel_initmem_freed(), similar to
      arch_is_kernel_text/data(), and add it to the checks in static_obj().
      This will always return 0 on architectures that do not define
      arch_is_kernel_initmem_freed. On s390, it will return 1 if initmem has
      been freed and the address is in the range [__init_begin, __init_end].
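
      A sketch of the s390 hook (the freed flag name is illustrative; the
      generic fallback simply returns 0 when an architecture does not
      provide this function):

      	static inline int arch_is_kernel_initmem_freed(unsigned long addr)
      	{
      		if (!initmem_freed)	/* set by the s390 free_initmem() */
      			return 0;
      		return addr >= (unsigned long)__init_begin &&
      		       addr <  (unsigned long)__init_end;
      	}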
      Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/kernel: introduce .dma sections · a80313ff
      Authored by Gerald Schaefer
      With a relocatable kernel that could reside at any place in memory, code
      and data that has to stay below 2 GB needs special handling.
      
      This patch introduces .dma sections for such text, data and ex_table.
      The sections will be part of the decompressor kernel, so they will not
      be relocated and stay below 2 GB. Their location is passed over to the
      decompressed / relocated kernel via the .boot.preserved.data section.
      
      The duald and aste for control register setup also need to stay below
      2 GB, so move the setup code from arch/s390/kernel/head64.S to
      arch/s390/boot/head.S. The duct and linkage_stack could reside above
      2 GB, but their content has to be preserved for the decompressed kernel,
      so they are also moved into the .dma section.
      
      The start and end address of the .dma sections is added to vmcoreinfo,
      for crash support, to help debugging in case the kernel crashed there.
      Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Reviewed-by: Philipp Rudo <prudo@linux.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  27. 23 Apr 2019, 2 commits
    • s390/mm: convert to the generic get_user_pages_fast code · 1a42010c
      Authored by Martin Schwidefsky
      Define gup_fast_permitted() to check against the asce_limit of the
      mm attached to the current task, then replace the s390-specific gup
      code with the generic implementation in mm/gup.c.
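
      The hook is essentially (arch/s390/include/asm/pgtable.h, as
      described):

      	#define gup_fast_permitted gup_fast_permitted
      	static inline bool gup_fast_permitted(unsigned long start, int nr_pages)
      	{
      		unsigned long len, end;

      		len = (unsigned long) nr_pages << PAGE_SHIFT;
      		end = start + len;
      		/* reject ranges beyond what the current table levels map */
      		return end >= start && end <= current->mm->context.asce_limit;
      	}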
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/mm: make the pxd_offset functions more robust · d1874a0c
      Authored by Martin Schwidefsky
      Change the way how pgd_offset, p4d_offset, pud_offset and pmd_offset
      walk the page tables. pgd_offset now always calculates the index for
      the top-level page table and adds it to the pgd, this is either a
      segment table offset for a 2-level setup, a region-3 offset for 3-levels,
      region-2 offset for 4-levels, or a region-1 offset for a 5-level setup.
      The other three functions p4d_offset, pud_offset and pmd_offset will
      only add the respective offset if they dereference the passed pointer.
      
      With the new way of walking the page tables a sequence like this from
      mm/gup.c now works:
      
           pgdp = pgd_offset(current->mm, addr);
           pgd = READ_ONCE(*pgdp);
           p4dp = p4d_offset(&pgd, addr);
           p4d = READ_ONCE(*p4dp);
           pudp = pud_offset(&p4d, addr);
           pud = READ_ONCE(*pudp);
           pmdp = pmd_offset(&pud, addr);
           pmd = READ_ONCE(*pmdp);
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>