1. 15 5月, 2019 7 次提交
    • M
      powerpc/mm/radix: mark as __tlbie_pid() and friends as__always_inline · efc344c5
      Masahiro Yamada 提交于
      This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
      place.  We need to eliminate potential issues beforehand.
      
      If it is enabled for powerpc, the following errors are reported:
      
        arch/powerpc/mm/tlb-radix.c: In function '__tlbie_lpid':
        arch/powerpc/mm/tlb-radix.c:148:2: warning: asm operand 3 probably doesn't match constraints
          asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
          ^~~
        arch/powerpc/mm/tlb-radix.c:148:2: error: impossible constraint in 'asm'
        arch/powerpc/mm/tlb-radix.c: In function '__tlbie_pid':
        arch/powerpc/mm/tlb-radix.c:118:2: warning: asm operand 3 probably doesn't match constraints
          asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
          ^~~
        arch/powerpc/mm/tlb-radix.c: In function '__tlbiel_pid':
        arch/powerpc/mm/tlb-radix.c:104:2: warning: asm operand 3 probably doesn't match constraints
          asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
          ^~~
      
      Link: http://lkml.kernel.org/r/20190423034959.13525-11-yamada.masahiro@socionext.comSigned-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Boris Brezillon <bbrezillon@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Norris <computersforpeace@gmail.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Marek Vasut <marek.vasut@gmail.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Malaterre <malat@debian.org>
      Cc: Miquel Raynal <miquel.raynal@bootlin.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Stefan Agner <stefan@agner.ch>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      efc344c5
    • M
      powerpc/mm/radix: mark __radix__flush_tlb_range_psize() as __always_inline · e12d6d7d
      Masahiro Yamada 提交于
      This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
      place.  We need to eliminate potential issues beforehand.
      
      If it is enabled for powerpc, the following error is reported:
      
        arch/powerpc/mm/tlb-radix.c: In function '__radix__flush_tlb_range_psize':
        arch/powerpc/mm/tlb-radix.c:104:2: error: asm operand 3 probably doesn't match constraints [-Werror]
          asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
          ^~~
        arch/powerpc/mm/tlb-radix.c:104:2: error: impossible constraint in 'asm'
      
      Link: http://lkml.kernel.org/r/20190423034959.13525-10-yamada.masahiro@socionext.comSigned-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Boris Brezillon <bbrezillon@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Norris <computersforpeace@gmail.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Marek Vasut <marek.vasut@gmail.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Malaterre <malat@debian.org>
      Cc: Miquel Raynal <miquel.raynal@bootlin.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Stefan Agner <stefan@agner.ch>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e12d6d7d
    • M
      powerpc/mm: Fix crashes with hugepages & 4K pages · 7338874c
      Michael Ellerman 提交于
      The recent commit to cleanup ifdefs in the hugepage initialisation led
      to crashes when using 4K pages as reported by Sachin:
      
        BUG: Kernel NULL pointer dereference at 0x0000001c
        Faulting instruction address: 0xc000000001d1e58c
        Oops: Kernel access of bad area, sig: 11 [#1]
        LE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
        ...
        CPU: 3 PID: 4635 Comm: futex_wake04 Tainted: G        W  O      5.1.0-next-20190507-autotest #1
        NIP:  c000000001d1e58c LR: c000000001d1e54c CTR: 0000000000000000
        REGS: c000000004937890 TRAP: 0300
        MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22424822  XER: 00000000
        CFAR: c00000000183e9e0 DAR: 000000000000001c DSISR: 40000000 IRQMASK: 0
        ...
        NIP kmem_cache_alloc+0xbc/0x5a0
        LR  kmem_cache_alloc+0x7c/0x5a0
        Call Trace:
          huge_pte_alloc+0x580/0x950
          hugetlb_fault+0x9a0/0x1250
          handle_mm_fault+0x490/0x4a0
          __do_page_fault+0x77c/0x1f00
          do_page_fault+0x28/0x50
          handle_page_fault+0x18/0x38
      
      This is caused by us trying to allocate from a NULL kmem cache in
      __hugepte_alloc(). The kmem cache is NULL because it was never
      allocated in hugetlbpage_init(), because add_huge_page_size() returned
      an error.
      
      The reason add_huge_page_size() returned an error is a simple typo, we
      are calling check_and_get_huge_psize(size) when we should be passing
      shift instead.
      
      The fact that we're able to trigger this path when the kmem caches are
      NULL is a separate bug, ie. we should not advertise any hugepage sizes
      if we haven't setup the required caches for them.
      
      This was only seen with 4K pages, with 64K pages we don't need to
      allocate any extra kmem caches because the 16M hugepage just occupies
      a single entry at the PMD level.
      
      Fixes: 723f268f ("powerpc/mm: cleanup ifdef mess in add_huge_page_size()")
      Reported-by: NSachin Sant <sachinp@linux.ibm.com>
      Tested-by: NSachin Sant <sachinp@linux.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      7338874c
    • D
      mm/memory_hotplug: make __remove_pages() and arch_remove_memory() never fail · ac5c9426
      David Hildenbrand 提交于
      All callers of arch_remove_memory() ignore errors.  And we should really
      try to remove any errors from the memory removal path.  No more errors are
      reported from __remove_pages().  BUG() in s390x code in case
      arch_remove_memory() is triggered.  We may implement that properly later.
      WARN in case powerpc code failed to remove the section mapping, which is
      better than ignoring the error completely right now.
      
      Link: http://lkml.kernel.org/r/20190409100148.24703-5-david@redhat.comSigned-off-by: NDavid Hildenbrand <david@redhat.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Oscar Salvador <osalvador@suse.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Stefan Agner <stefan@agner.ch>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Arun KS <arunks@codeaurora.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Mathieu Malaterre <malat@debian.org>
      Cc: Andrew Banman <andrew.banman@hpe.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Mike Travis <mike.travis@hpe.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ac5c9426
    • M
      mm, memory_hotplug: provide a more generic restrictions for memory hotplug · 940519f0
      Michal Hocko 提交于
      arch_add_memory, __add_pages take a want_memblock which controls whether
      the newly added memory should get the sysfs memblock user API (e.g.
      ZONE_DEVICE users do not want/need this interface).  Some callers even
      want to control where do we allocate the memmap from by configuring
      altmap.
      
      Add a more generic hotplug context for arch_add_memory and __add_pages.
      struct mhp_restrictions contains flags which contains additional features
      to be enabled by the memory hotplug (MHP_MEMBLOCK_API currently) and
      altmap for alternative memmap allocator.
      
      This patch shouldn't introduce any functional change.
      
      [akpm@linux-foundation.org: build fix]
      Link: http://lkml.kernel.org/r/20190408082633.2864-3-osalvador@suse.deSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NOscar Salvador <osalvador@suse.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: David Hildenbrand <david@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      940519f0
    • C
      initramfs: provide a generic free_initrd_mem implementation · 4afd58e1
      Christoph Hellwig 提交于
      For most architectures free_initrd_mem just expands to the same
      free_reserved_area call.  Provide that as a generic implementation marked
      __weak.
      
      Link: http://lkml.kernel.org/r/20190213174621.29297-8-hch@lst.deSigned-off-by: NChristoph Hellwig <hch@lst.de>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
      Acked-by: NMike Rapoport <rppt@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>	[arm64]
      Cc: Steven Price <steven.price@arm.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4afd58e1
    • I
      mm/gup: replace get_user_pages_longterm() with FOLL_LONGTERM · 932f4a63
      Ira Weiny 提交于
      Pach series "Add FOLL_LONGTERM to GUP fast and use it".
      
      HFI1, qib, and mthca, use get_user_pages_fast() due to its performance
      advantages.  These pages can be held for a significant time.  But
      get_user_pages_fast() does not protect against mapping FS DAX pages.
      
      Introduce FOLL_LONGTERM and use this flag in get_user_pages_fast() which
      retains the performance while also adding the FS DAX checks.  XDP has also
      shown interest in using this functionality.[1]
      
      In addition we change get_user_pages() to use the new FOLL_LONGTERM flag
      and remove the specialized get_user_pages_longterm call.
      
      [1] https://lkml.org/lkml/2019/3/19/939
      
      "longterm" is a relative thing and at this point is probably a misnomer.
      This is really flagging a pin which is going to be given to hardware and
      can't move.  I've thought of a couple of alternative names but I think we
      have to settle on if we are going to use FL_LAYOUT or something else to
      solve the "longterm" problem.  Then I think we can change the flag to a
      better name.
      
      Secondly, it depends on how often you are registering memory.  I have
      spoken with some RDMA users who consider MR in the performance path...
      For the overall application performance.  I don't have the numbers as the
      tests for HFI1 were done a long time ago.  But there was a significant
      advantage.  Some of which is probably due to the fact that you don't have
      to hold mmap_sem.
      
      Finally, architecturally I think it would be good for everyone to use
      *_fast.  There are patches submitted to the RDMA list which would allow
      the use of *_fast (they reworking the use of mmap_sem) and as soon as they
      are accepted I'll submit a patch to convert the RDMA core as well.  Also
      to this point others are looking to use *_fast.
      
      As an aside, Jasons pointed out in my previous submission that *_fast and
      *_unlocked look very much the same.  I agree and I think further cleanup
      will be coming.  But I'm focused on getting the final solution for DAX at
      the moment.
      
      This patch (of 7):
      
      This patch starts a series which aims to support FOLL_LONGTERM in
      get_user_pages_fast().  Some callers who would like to do a longterm (user
      controlled pin) of pages with the fast variant of GUP for performance
      purposes.
      
      Rather than have a separate get_user_pages_longterm() call, introduce
      FOLL_LONGTERM and change the longterm callers to use it.
      
      This patch does not change any functionality.  In the short term
      "longterm" or user controlled pins are unsafe for Filesystems and FS DAX
      in particular has been blocked.  However, callers of get_user_pages_fast()
      were not "protected".
      
      FOLL_LONGTERM can _only_ be supported with get_user_pages[_fast]() as it
      requires vmas to determine if DAX is in use.
      
      NOTE: In merging with the CMA changes we opt to change the
      get_user_pages() call in check_and_migrate_cma_pages() to a call of
      __get_user_pages_locked() on the newly migrated pages.  This makes the
      code read better in that we are calling __get_user_pages_locked() on the
      pages before and after a potential migration.
      
      As a side affect some of the interfaces are cleaned up but this is not the
      primary purpose of the series.
      
      In review[1] it was asked:
      
      <quote>
      > This I don't get - if you do lock down long term mappings performance
      > of the actual get_user_pages call shouldn't matter to start with.
      >
      > What do I miss?
      
      A couple of points.
      
      First "longterm" is a relative thing and at this point is probably a
      misnomer.  This is really flagging a pin which is going to be given to
      hardware and can't move.  I've thought of a couple of alternative names
      but I think we have to settle on if we are going to use FL_LAYOUT or
      something else to solve the "longterm" problem.  Then I think we can
      change the flag to a better name.
      
      Second, It depends on how often you are registering memory.  I have spoken
      with some RDMA users who consider MR in the performance path...  For the
      overall application performance.  I don't have the numbers as the tests
      for HFI1 were done a long time ago.  But there was a significant
      advantage.  Some of which is probably due to the fact that you don't have
      to hold mmap_sem.
      
      Finally, architecturally I think it would be good for everyone to use
      *_fast.  There are patches submitted to the RDMA list which would allow
      the use of *_fast (they reworking the use of mmap_sem) and as soon as they
      are accepted I'll submit a patch to convert the RDMA core as well.  Also
      to this point others are looking to use *_fast.
      
      As an asside, Jasons pointed out in my previous submission that *_fast and
      *_unlocked look very much the same.  I agree and I think further cleanup
      will be coming.  But I'm focused on getting the final solution for DAX at
      the moment.
      
      </quote>
      
      [1] https://lore.kernel.org/lkml/20190220180255.GA12020@iweiny-DESK2.sc.intel.com/T/#md6abad2569f3bf6c1f03686c8097ab6563e94965
      
      [ira.weiny@intel.com: v3]
        Link: http://lkml.kernel.org/r/20190328084422.29911-2-ira.weiny@intel.com
      Link: http://lkml.kernel.org/r/20190328084422.29911-2-ira.weiny@intel.com
      Link: http://lkml.kernel.org/r/20190317183438.2057-2-ira.weiny@intel.comSigned-off-by: NIra Weiny <ira.weiny@intel.com>
      Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Mike Marshall <hubcap@omnibond.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      932f4a63
  2. 14 5月, 2019 1 次提交
  3. 06 5月, 2019 5 次提交
  4. 03 5月, 2019 1 次提交
    • R
      powerpc/mm: Warn if W+X pages found on boot · 453d87f6
      Russell Currey 提交于
      Implement code to walk all pages and warn if any are found to be both
      writable and executable.  Depends on STRICT_KERNEL_RWX enabled, and is
      behind the DEBUG_WX config option.
      
      This only runs on boot and has no runtime performance implications.
      
      Very heavily influenced (and in some cases copied verbatim) from the
      ARM64 code written by Laura Abbott (thanks!), since our ptdump
      infrastructure is similar.
      Signed-off-by: NRussell Currey <ruscur@russell.cc>
      [mpe: Fixup build error when disabled]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      453d87f6
  5. 02 5月, 2019 26 次提交