1. 05 Jun, 2020 5 commits
    • arch/kmap: ensure kmap_prot visibility · db458d73
      Ira Weiny committed
      We want to support kmap_atomic_prot() on all architectures and it makes
      sense to define kmap_atomic() to use the default kmap_prot.
      
      So we ensure all architectures have a globally available kmap_prot, either
      as a define or as an exported symbol.
      Signed-off-by: Ira Weiny <ira.weiny@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Christian König <christian.koenig@amd.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20200507150004.1423069-9-ira.weiny@intel.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • arch/kunmap_atomic: consolidate duplicate code · abca2500
      Ira Weiny committed
      Every single architecture (including !CONFIG_HIGHMEM) calls...
      
      	pagefault_enable();
      	preempt_enable();
      
      ... before returning from __kunmap_atomic().  Lift this code into the
      kunmap_atomic() macro.
      
      While we are at it rename __kunmap_atomic() to kunmap_atomic_high() to
      be consistent.
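
The lifting described above can be sketched in userspace C. This is only an illustration, not the kernel's actual header: the counters and stub functions below are hypothetical stand-ins for the real preemption and page-fault primitives, and kunmap_atomic_high() here merely records that it was called.

```c
/*
 * Userspace mock: plain counters stand in for the kernel's preemption
 * and page-fault state. Both start at 1, as if kmap_atomic() had run.
 */
static int pagefault_disabled_depth = 1;
static int preempt_disabled_depth = 1;
static int arch_unmap_calls;

static void pagefault_enable(void) { pagefault_disabled_depth--; }
static void preempt_enable(void)   { preempt_disabled_depth--; }

/*
 * Arch-specific part (the function formerly named __kunmap_atomic()).
 * After the change it only tears down the mapping; here it just counts.
 */
static void kunmap_atomic_high(void *kvaddr)
{
	(void)kvaddr;
	arch_unmap_calls++;
}

/* Common part, lifted out of every architecture into one macro. */
#define kunmap_atomic(addr)		\
do {					\
	kunmap_atomic_high(addr);	\
	pagefault_enable();		\
	preempt_enable();		\
} while (0)
```

With this shape, an architecture that re-enabled page faults or preemption inside its own kunmap_atomic_high() would do so twice, which is the kind of problem the bracketed follow-up note refers to.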
      
      [ira.weiny@intel.com: don't enable pagefault/preempt twice]
        Link: http://lkml.kernel.org/r/20200518184843.3029640-1-ira.weiny@intel.com
      [akpm@linux-foundation.org: coding style fixes]
      Signed-off-by: Ira Weiny <ira.weiny@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Christian König <christian.koenig@amd.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Link: http://lkml.kernel.org/r/20200507150004.1423069-8-ira.weiny@intel.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • arch/kmap_atomic: consolidate duplicate code · 78b6d91e
      Ira Weiny committed
      Every arch has the same code to ensure atomic operations and a check for
      !HIGHMEM page.
      
      Remove the duplicate code by defining a core kmap_atomic() which only
      calls the arch specific kmap_atomic_high() when the page is high memory.
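
A minimal userspace sketch of that split, under stated assumptions: struct page, the counters, and the stub helpers below are simplified mocks invented for illustration, and the real core function may differ in detail. The point is only that the common checks live in one place and the arch hook is reached solely for high memory pages.

```c
#include <stdbool.h>
#include <stddef.h>

/* Mock page: a flag instead of real page flags, a pointer instead of
 * the permanent direct mapping. Both fields are hypothetical. */
struct page {
	bool highmem;
	void *lowmem_addr;
};

static int preempt_count, pagefault_count, arch_calls;
static char highmem_slot[64];	/* mock temporary mapping slot */

static void preempt_disable(void)   { preempt_count++; }
static void pagefault_disable(void) { pagefault_count++; }
static bool PageHighMem(const struct page *page) { return page->highmem; }
static void *page_address(const struct page *page) { return page->lowmem_addr; }

/* Arch-specific hook: only reached for high memory pages. */
static void *kmap_atomic_high(struct page *page)
{
	(void)page;
	arch_calls++;
	return highmem_slot;
}

/* Core function: the checks every architecture used to duplicate. */
static void *kmap_atomic(struct page *page)
{
	preempt_disable();
	pagefault_disable();
	if (!PageHighMem(page))
		return page_address(page);
	return kmap_atomic_high(page);
}
```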
      
      [akpm@linux-foundation.org: coding style fixes]
      Signed-off-by: Ira Weiny <ira.weiny@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Christian König <christian.koenig@amd.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20200507150004.1423069-7-ira.weiny@intel.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • {x86,powerpc,microblaze}/kmap: move preempt disable · ee9bc5fd
      Ira Weiny committed
      During this kmap() conversion series we must maintain bisect-ability.  To
      do this, kmap_atomic_prot() in x86, powerpc, and microblaze need to remain
      functional.
      
      Create a temporary inline version of kmap_atomic_prot within these
      architectures so we can rework their kmap_atomic() calls and then lift
      kmap_atomic_prot() to the core.
      Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Ira Weiny <ira.weiny@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Christian König <christian.koenig@amd.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20200507150004.1423069-6-ira.weiny@intel.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • powerpc: add support for folded p4d page tables · 2fb47060
      Mike Rapoport committed
      Implement primitives necessary for the 4th level folding, add walks of p4d
      level where appropriate and replace 5level-fixup.h with pgtable-nop4d.h.
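
What "folding" the 4th level means can be sketched as follows. This is a simplified userspace illustration, assuming the usual folded-level pattern in which the p4d entry is just a wrapper around the pgd entry; the type layouts are mocks, not the real powerpc definitions.

```c
/* Mock top-level entry. */
typedef struct { unsigned long val; } pgd_t;

/* Folded level: a p4d entry is only a wrapper around a pgd entry. */
typedef struct { pgd_t pgd; } p4d_t;

/*
 * With the level folded, "walking" from pgd to p4d does not index a
 * separate table: it returns the same entry, viewed through the p4d type.
 */
static p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
{
	(void)address;
	return (p4d_t *)pgd;
}
```

Generic code can then always walk pgd, p4d, pud, and so on, while architectures without a real 4th level pay nothing for the extra step.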
      
      [rppt@linux.ibm.com: powerpc/xmon: drop unused pgdir variable in show_pte() function]
        Link: http://lkml.kernel.org/r/20200519181454.GI1059226@linux.ibm.com
      [rppt@linux.ibm.com: build fix]
        Link: http://lkml.kernel.org/r/20200423141845.GI13521@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Tested-by: Christophe Leroy <christophe.leroy@c-s.fr> # 8xx and 83xx
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Brian Cain <bcain@codeaurora.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Geert Uytterhoeven <geert+renesas@glider.be>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: James Morse <james.morse@arm.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
      Cc: Ley Foon Tan <ley.foon.tan@intel.com>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: http://lkml.kernel.org/r/20200414153455.21744-9-rppt@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 04 Jun, 2020 4 commits
    • hugetlbfs: remove hugetlb_add_hstate() warning for existing hstate · 38237830
      Mike Kravetz committed
      hugetlb_add_hstate() prints a warning if the hstate already exists.  This
      was originally done as part of kernel command line parsing.  If
      'hugepagesz=' was specified more than once, the warning
      
      	pr_warn("hugepagesz= specified twice, ignoring\n");
      
      would be printed.
      
      Some architectures want to enable all huge page sizes.  They would call
      hugetlb_add_hstate for all supported sizes.  However, this was done after
      command line processing and as a result hstates could have already been
      created for some sizes.  To make sure no warnings were printed, there would
      often be code like:
      
      	if (!size_to_hstate(size))
      		hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT);
      
      The only time we want to print the warning is as the result of command
      line processing.  So, remove the warning from hugetlb_add_hstate and add
      it to the single arch independent routine processing "hugepagesz=".  After
      this, calls to size_to_hstate() in arch specific code can be removed and
      hugetlb_add_hstate can be called without worrying about warning messages.
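
The resulting division of labour can be sketched in userspace C. The names hugepagesz_setup(), the fixed-size table, and the byte-size argument are simplifications assumed for illustration (the kernel routine takes a page order, as the snippet above shows); only the placement of the warning is the point.

```c
#include <stdbool.h>
#include <stdio.h>

#define MAX_HSTATE 4

static unsigned long hstate_sizes[MAX_HSTATE];
static int nr_hstates;
static int warnings;	/* counts warnings so the behaviour can be checked */

static bool size_to_hstate(unsigned long size)
{
	for (int i = 0; i < nr_hstates; i++)
		if (hstate_sizes[i] == size)
			return true;
	return false;
}

/* After the change: an already existing size is ignored silently. */
static void hugetlb_add_hstate(unsigned long size)
{
	if (size_to_hstate(size) || nr_hstates == MAX_HSTATE)
		return;
	hstate_sizes[nr_hstates++] = size;
}

/* Hypothetical stand-in for the single "hugepagesz=" handler: the
 * duplicate warning now lives only on the command line path. */
static void hugepagesz_setup(unsigned long size)
{
	if (size_to_hstate(size)) {
		warnings++;
		fprintf(stderr, "hugepagesz= specified twice, ignoring\n");
		return;
	}
	hugetlb_add_hstate(size);
}
```

Arch code that wants every supported size can now call hugetlb_add_hstate() unconditionally, without first checking size_to_hstate().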
      
      [mike.kravetz@oracle.com: fix hugetlb initialization]
        Link: http://lkml.kernel.org/r/4c36c6ce-3774-78fa-abc4-b7346bf24348@oracle.com
        Link: http://lkml.kernel.org/r/20200428205614.246260-5-mike.kravetz@oracle.com
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Tested-by: Anders Roxell <anders.roxell@linaro.org>
      Acked-by: Mina Almasry <almasrymina@google.com>
      Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>	[s390]
      Acked-by: Will Deacon <will@kernel.org>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Longpeng <longpeng2@huawei.com>
      Cc: Nitesh Narayan Lal <nitesh@redhat.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Link: http://lkml.kernel.org/r/20200417185049.275845-4-mike.kravetz@oracle.com
      Link: http://lkml.kernel.org/r/20200428205614.246260-4-mike.kravetz@oracle.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hugetlbfs: move hugepagesz= parsing to arch independent code · 359f2544
      Mike Kravetz committed
      Now that architectures provide arch_hugetlb_valid_size(), parsing of
      "hugepagesz=" can be done in architecture independent code.  Create a
      single routine to handle hugepagesz= parsing and remove all arch specific
      routines.  We can also remove the interface hugetlb_bad_size() as this is
      no longer used outside arch independent code.
      
      This also provides consistent behavior of hugetlbfs command line options.
      The hugepagesz= option should only be specified once for a specific size,
      but some architectures allow multiple instances.  This appears to be more
      of an oversight when code was added by some architectures to set up ALL
      huge page sizes.
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Tested-by: Sandipan Das <sandipan@linux.ibm.com>
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Acked-by: Mina Almasry <almasrymina@google.com>
      Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>	[s390]
      Acked-by: Will Deacon <will@kernel.org>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Longpeng <longpeng2@huawei.com>
      Cc: Nitesh Narayan Lal <nitesh@redhat.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Anders Roxell <anders.roxell@linaro.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Link: http://lkml.kernel.org/r/20200417185049.275845-3-mike.kravetz@oracle.com
      Link: http://lkml.kernel.org/r/20200428205614.246260-3-mike.kravetz@oracle.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hugetlbfs: add arch_hugetlb_valid_size · ae94da89
      Mike Kravetz committed
      Patch series "Clean up hugetlb boot command line processing", v4.
      
      Longpeng(Mike) reported a weird message from hugetlb command line
      processing and proposed a solution [1].  While the proposed patch does
      address the specific issue, there are other related issues in command line
      processing.  As hugetlbfs evolved, updates to command line processing have
      been made to meet immediate needs and not necessarily in a coordinated
      manner.  The result is that some processing is done in arch specific code,
      some is done in arch independent code and coordination is problematic.
      Semantics can vary between architectures.
      
      The patch series does the following:
      - Define arch specific arch_hugetlb_valid_size routine used to validate
        passed huge page sizes.
      - Move hugepagesz= command line parsing out of arch specific code and into
        an arch independent routine.
      - Clean up command line processing to follow desired semantics and
        document those semantics.
      
      [1] https://lore.kernel.org/linux-mm/20200305033014.1152-1-longpeng2@huawei.com
      
      This patch (of 3):
      
      The architecture independent routine hugetlb_default_setup sets up the
      default huge pages size.  It has no way to verify if the passed value is
      valid, so it accepts it and attempts to validate at a later time.  This
      requires undocumented cooperation between the arch specific and arch
      independent code.
      
      For architectures that support more than one huge page size, provide a
      routine arch_hugetlb_valid_size to validate a huge page size.
      hugetlb_default_setup can use this to validate passed values.
      
      arch_hugetlb_valid_size will also be used in a subsequent patch to move
      processing of the "hugepagesz=" in arch specific code to a common routine
      in arch independent code.
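
As a rough userspace sketch of the interface described above: the two accepted sizes (2 MB and 1 GB) and the boolean return value of hugetlb_default_setup() are assumptions made for illustration, since each architecture defines its own set of valid sizes and the real setup routine parses a command line string.

```c
#include <stdbool.h>

/* Hypothetical sizes for an architecture with two huge page sizes. */
#define SZ_2M (2UL << 20)
#define SZ_1G (1UL << 30)

/* Arch-specific: answers one question, "is this size supported here?" */
static bool arch_hugetlb_valid_size(unsigned long size)
{
	return size == SZ_2M || size == SZ_1G;
}

static unsigned long default_hstate_size;

/*
 * Arch-independent: can now reject a bad value immediately instead of
 * accepting it and validating at a later time.
 */
static bool hugetlb_default_setup(unsigned long size)
{
	if (!arch_hugetlb_valid_size(size))
		return false;
	default_hstate_size = size;
	return true;
}
```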
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>	[s390]
      Acked-by: Will Deacon <will@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Longpeng <longpeng2@huawei.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Mina Almasry <almasrymina@google.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Nitesh Narayan Lal <nitesh@redhat.com>
      Cc: Anders Roxell <anders.roxell@linaro.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Link: http://lkml.kernel.org/r/20200428205614.246260-1-mike.kravetz@oracle.com
      Link: http://lkml.kernel.org/r/20200428205614.246260-2-mike.kravetz@oracle.com
      Link: http://lkml.kernel.org/r/20200417185049.275845-1-mike.kravetz@oracle.com
      Link: http://lkml.kernel.org/r/20200417185049.275845-2-mike.kravetz@oracle.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: use free_area_init() instead of free_area_init_nodes() · 9691a071
      Mike Rapoport committed
      free_area_init() has effectively become a wrapper for
      free_area_init_nodes() and there is no point in keeping it.  Still, the
      free_area_init() name is shorter and more general, as it does not imply
      the necessity to initialize multiple nodes.
      
      Rename free_area_init_nodes() to free_area_init(), update the callers and
      drop the old version of free_area_init().
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Tested-by: Hoan Tran <hoan@os.amperecomputing.com>	[arm64]
      Reviewed-by: Baoquan He <bhe@redhat.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Brian Cain <bcain@codeaurora.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ley Foon Tan <ley.foon.tan@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Nick Hu <nickhu@andestech.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: http://lkml.kernel.org/r/20200412194859.12663-6-rppt@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 03 Jun, 2020 1 commit
    • powerpc: remove __ioremap_at and __iounmap_at · 91f03f29
      Christoph Hellwig committed
      These helpers are only used for remapping the ISA I/O base.  Replace the
      mapping side with a remap_isa_range helper in isa-bridge.c that hard codes
      all the known arguments, and just remove __iounmap_at in favour of open
      coding it in the only caller.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Gao Xiang <xiang@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Kelley <mikelley@microsoft.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Cc: Wei Liu <wei.liu@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mackerras <paulus@ozlabs.org>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: http://lkml.kernel.org/r/20200414131348.444715-8-hch@lst.de
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 20 May, 2020 1 commit
  5. 22 Apr, 2020 1 commit
  6. 11 Apr, 2020 4 commits
  7. 08 Apr, 2020 1 commit
  8. 03 Apr, 2020 4 commits
    • mm: allow VM_FAULT_RETRY for multiple times · 4064b982
      Peter Xu committed
      The idea comes from a discussion between Linus and Andrea [1].
      
      Before this patch we only allowed a page fault to retry once.  We achieved
      this by clearing the FAULT_FLAG_ALLOW_RETRY flag when doing
      handle_mm_fault() the second time.  This was mainly used to avoid
      unexpected starvation of the system by looping forever to handle the
      page fault on a single page.  However that should hardly happen, and after
      all, for each code path to return VM_FAULT_RETRY we'll first wait for a
      condition (during which time we should possibly yield the cpu) to happen
      before VM_FAULT_RETRY is really returned.
      
      This patch removes the restriction by keeping the FAULT_FLAG_ALLOW_RETRY
      flag when we receive VM_FAULT_RETRY.  It means that the page fault handler
      can now retry the page fault multiple times if necessary without the
      need to generate another page fault event.  Meanwhile we still keep the
      FAULT_FLAG_TRIED flag so the page fault handler can still identify whether a
      page fault is the first attempt or not.
      
      Then we'll have these combinations of fault flags (only considering
      ALLOW_RETRY flag and TRIED flag):
      
        - ALLOW_RETRY and !TRIED:  this means the page fault allows to
                                   retry, and this is the first try
      
        - ALLOW_RETRY and TRIED:   this means the page fault allows to
                                   retry, and this is not the first try
      
        - !ALLOW_RETRY and !TRIED: this means the page fault does not allow
                                   to retry at all
      
        - !ALLOW_RETRY and TRIED:  this is forbidden and should never be used
      
      In existing code we have multiple places that have taken special care of
      the first condition above by checking against (fault_flags &
      FAULT_FLAG_ALLOW_RETRY).  This patch introduces a simple helper to detect
      the first try of a page fault by checking against both (fault_flags &
      FAULT_FLAG_ALLOW_RETRY) and !(fault_flags & FAULT_FLAG_TRIED), because now
      even the 2nd try will have ALLOW_RETRY set, and then uses that helper in
      all existing special paths.  One example is in __lock_page_or_retry(): now
      we'll drop the mmap_sem only in the first attempt of a page fault and we'll
      keep it in follow-up retries, so the old locking behavior will be retained.
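
The helper and the flag combinations listed above can be sketched like this. The helper name and the numeric flag values are assumptions for illustration (the commit message does not spell them out); only the two-flag test itself comes from the text.

```c
#include <stdbool.h>

/* Illustrative bit values; the real definitions live in the kernel's mm headers. */
#define FAULT_FLAG_ALLOW_RETRY	0x04u
#define FAULT_FLAG_TRIED	0x20u

/*
 * True only for "ALLOW_RETRY and !TRIED": the fault may be retried and
 * this is the first attempt. Checking ALLOW_RETRY alone is no longer
 * enough, because later attempts now keep that flag set as well.
 */
static bool fault_flag_allow_retry_first(unsigned int flags)
{
	return (flags & FAULT_FLAG_ALLOW_RETRY) &&
	       !(flags & FAULT_FLAG_TRIED);
}
```

A path such as the __lock_page_or_retry() example would consult this helper, rather than the bare ALLOW_RETRY bit, to decide whether it is on the first attempt.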
      
      This will be a nice enhancement for current code [2] and, at the same time,
      supporting material for the future userfaultfd-writeprotect work, since in
      that work there will always be an explicit userfault writeprotect retry
      for protected pages, and if that cannot resolve the page fault (e.g., when
      userfaultfd-writeprotect is used in conjunction with swapped pages) then
      we'll possibly need a 3rd retry of the page fault.  It might also benefit
      other potential users who have requirements similar to userfault
      write-protection.
      
      GUP code is not touched yet and will be covered in a follow-up patch.
      
      Please read the thread below for more information.
      
      [1] https://lore.kernel.org/lkml/20171102193644.GB22686@redhat.com/
      [2] https://lore.kernel.org/lkml/20181230154648.GB9832@redhat.com/
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Suggested-by: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Tested-by: Brian Geffon <bgeffon@google.com>
      Cc: Bobby Powers <bobbypowers@gmail.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
      Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
      Cc: Martin Cracauer <cracauer@cons.org>
      Cc: Marty McFadden <mcfadden8@llnl.gov>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Maya Gokhale <gokhale2@llnl.gov>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Link: http://lkml.kernel.org/r/20200220160246.9790-1-peterx@redhat.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: introduce FAULT_FLAG_DEFAULT · dde16072
      Peter Xu committed
      Although there are tons of arch-specific page fault handlers, most of them
      still share the same initial value of the page fault flags.  Say,
      nearly all of the page fault handlers would allow the fault to be retried,
      and they also allow the fault to respond to SIGKILL.
      
      Let's define a default value for the fault flags to replace those initial
      page fault flags that were copied over.  With this, it'll be far easier to
      introduce a new fault flag that can be used by all the architectures instead
      of touching all the archs.
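
A minimal sketch of the idea, assuming the default is the combination the message describes (retry allowed, fault killable). The numeric values and the initial_fault_flags() helper are hypothetical and only show how an arch handler would start from the shared default and add its own bits.

```c
/* Illustrative bit values, not copied from the kernel headers. */
#define FAULT_FLAG_WRITE	0x01u
#define FAULT_FLAG_ALLOW_RETRY	0x04u
#define FAULT_FLAG_KILLABLE	0x10u

/* The shared starting point: retry allowed, fault responds to SIGKILL. */
#define FAULT_FLAG_DEFAULT	(FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE)

/*
 * Hypothetical arch fault handler prologue: start from the default and
 * add only what is specific to this fault.
 */
static unsigned int initial_fault_flags(int is_write)
{
	unsigned int flags = FAULT_FLAG_DEFAULT;

	if (is_write)
		flags |= FAULT_FLAG_WRITE;
	return flags;
}
```

Adding a new common flag then means changing one definition instead of every architecture's handler.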
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Tested-by: Brian Geffon <bgeffon@google.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Bobby Powers <bobbypowers@gmail.com>
      Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
      Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
      Cc: Martin Cracauer <cracauer@cons.org>
      Cc: Marty McFadden <mcfadden8@llnl.gov>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Maya Gokhale <gokhale2@llnl.gov>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Link: http://lkml.kernel.org/r/20200220160238.9694-1-peterx@redhat.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • powerpc/mm: use helper fault_signal_pending() · c9a0dad1
      Peter Xu committed
      Let powerpc code use the new helper, by moving the signal handling
      earlier, before the retry logic.
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Tested-by: Brian Geffon <bgeffon@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Bobby Powers <bobbypowers@gmail.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
      Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
      Cc: Martin Cracauer <cracauer@cons.org>
      Cc: Marty McFadden <mcfadden8@llnl.gov>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Maya Gokhale <gokhale2@llnl.gov>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Link: http://lkml.kernel.org/r/20200220160222.9422-1-peterx@redhat.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/vma: make vma_is_foreign() available for general use · 7969f226
      Anshuman Khandual committed
      The idea of a foreign VMA with respect to the present context is very generic.
      But currently there are two identical definitions of it, in the powerpc and
      x86 platforms.  Let's consolidate those redundant definitions while making
      vma_is_foreign() available for general use later.  This should not cause
      any functional change.
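
For readers unfamiliar with the term, a foreign VMA is one that does not belong to the address space of the task currently running. The check can be sketched in userspace as below; the struct layouts and the current_task pointer are mocks standing in for the kernel's types and its `current` macro, and the body is a sketch rather than a copy of the consolidated definition.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal mock types: only the fields the check needs. */
struct mm_struct { int id; };
struct vm_area_struct { struct mm_struct *vm_mm; };
struct task_struct { struct mm_struct *mm; };

/* Mock for the kernel's `current` task pointer. */
static struct task_struct *current_task;

static bool vma_is_foreign(const struct vm_area_struct *vma)
{
	/* A task with no mm of its own (e.g. a kernel thread) owns no VMA. */
	if (!current_task->mm)
		return true;
	/* Otherwise the VMA is foreign if it belongs to a different mm. */
	return current_task->mm != vma->vm_mm;
}
```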
      Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Link: http://lkml.kernel.org/r/1582782965-3274-3-git-send-email-anshuman.khandual@arm.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 25 Mar, 2020 2 commits
  10. 17 Mar, 2020 2 commits
  11. 13 Mar, 2020 1 commit
  12. 05 Mar, 2020 1 commit
    • powerpc/mm: Fix missing KUAP disable in flush_coherent_icache() · 59bee45b
      Michael Ellerman committed
      Stefan reported a strange kernel fault which turned out to be due to a
      missing KUAP disable in flush_coherent_icache() called from
      flush_icache_range().
      
      The fault looks like:
      
        Kernel attempted to access user page (7fffc30d9c00) - exploit attempt? (uid: 1009)
        BUG: Unable to handle kernel data access on read at 0x7fffc30d9c00
        Faulting instruction address: 0xc00000000007232c
        Oops: Kernel access of bad area, sig: 11 [#1]
        LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
        CPU: 35 PID: 5886 Comm: sigtramp Not tainted 5.6.0-rc2-gcc-8.2.0-00003-gfc37a163 #79
        NIP:  c00000000007232c LR: c00000000003b7fc CTR: 0000000000000000
        REGS: c000001e11093940 TRAP: 0300   Not tainted  (5.6.0-rc2-gcc-8.2.0-00003-gfc37a163)
        MSR:  900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 28000884  XER: 00000000
        CFAR: c0000000000722fc DAR: 00007fffc30d9c00 DSISR: 08000000 IRQMASK: 0
        GPR00: c00000000003b7fc c000001e11093bd0 c0000000023ac200 00007fffc30d9c00
        GPR04: 00007fffc30d9c18 0000000000000000 c000001e11093bd4 0000000000000000
        GPR08: 0000000000000000 0000000000000001 0000000000000000 c000001e1104ed80
        GPR12: 0000000000000000 c000001fff6ab380 c0000000016be2d0 4000000000000000
        GPR16: c000000000000000 bfffffffffffffff 0000000000000000 0000000000000000
        GPR20: 00007fffc30d9c00 00007fffc30d8f58 00007fffc30d9c18 00007fffc30d9c20
        GPR24: 00007fffc30d9c18 0000000000000000 c000001e11093d90 c000001e1104ed80
        GPR28: c000001e11093e90 0000000000000000 c0000000023d9d18 00007fffc30d9c00
        NIP flush_icache_range+0x5c/0x80
        LR  handle_rt_signal64+0x95c/0xc2c
        Call Trace:
          0xc000001e11093d90 (unreliable)
          handle_rt_signal64+0x93c/0xc2c
          do_notify_resume+0x310/0x430
          ret_from_except_lite+0x70/0x74
        Instruction dump:
        409e002c 7c0802a6 3c62ff31 3863f6a0 f8010080 48195fed 60000000 48fe4c8d
        60000000 e8010080 7c0803a6 7c0004ac <7c00ffac> 7c0004ac 4c00012c 38210070
      
      This path through handle_rt_signal64() to setup_trampoline() and
      flush_icache_range() is only triggered by 64-bit processes that have
      unmapped their VDSO, which is rare.
      
      flush_icache_range() takes a range of addresses to flush. In
      flush_coherent_icache() we implement an optimisation for CPUs where we
      know we don't actually have to flush the whole range, we just need to
      do a single icbi.
      
      However we still execute the icbi on the user address of the start of
      the range we're flushing. On CPUs that also implement KUAP (Power9)
      that leads to the spurious fault above.
      
      We should be able to pass any address, including a kernel address, to
      the icbi on these CPUs, which would avoid any interaction with KUAP.
      But I don't want to make that change in a bug fix, just in case it
      surfaces some strange behaviour on some CPU.
      
      So for now just disable KUAP around the icbi. Note the icbi is treated
      as a load, so we allow read access, not write as you'd expect.
      
      Fixes: 890274c2 ("powerpc/64s: Implement KUAP for Radix MMU")
      Cc: stable@vger.kernel.org # v5.2+
      Reported-by: Stefan Berger <stefanb@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20200303235708.26004-1-mpe@ellerman.id.au
      59bee45b
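
The shape of this fix, opening a read window around the single icbi and closing it again, can be sketched as a userspace model. Everything below is an illustrative assumption rather than the kernel code: the model_* helpers, the fault counter, and the user/kernel address boundary are hypothetical names that only mimic the behaviour described in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state: true while kernel reads of user memory are allowed. */
static bool user_read_allowed;
/* Counts icbi attempts that would have faulted under KUAP in this model. */
static int kuap_faults;

/* Assumed boundary between user and kernel addresses in this model. */
#define MODEL_KERNEL_START 0xc000000000000000ULL

static bool is_user_address(unsigned long long addr)
{
	return addr < MODEL_KERNEL_START;
}

static void model_allow_user_read(void)   { user_read_allowed = true; }
static void model_prevent_user_read(void) { user_read_allowed = false; }

/* The icbi is treated as a load, so it needs read access to its address. */
static void model_icbi(unsigned long long addr)
{
	if (is_user_address(addr) && !user_read_allowed)
		kuap_faults++;
}

/* Before the fix: the icbi runs on the user address with KUAP still active. */
static void flush_one_line_unfixed(unsigned long long addr)
{
	model_icbi(addr);
}

/* After the fix: read access is opened only around the icbi. */
static void flush_one_line_fixed(unsigned long long addr)
{
	model_allow_user_read();
	model_icbi(addr);
	model_prevent_user_read();
}

/* Usage example with the faulting address from the oops above. */
static void kuap_icbi_example(void)
{
	unsigned long long user_addr = 0x7fffc30d9c00ULL;

	kuap_faults = 0;
	user_read_allowed = false;

	flush_one_line_unfixed(user_addr);
	assert(kuap_faults == 1);      /* spurious fault */

	flush_one_line_fixed(user_addr);
	assert(kuap_faults == 1);      /* no new fault */
	assert(!user_read_allowed);    /* window closed again */
}
```

The model also shows why passing a kernel address to the icbi, the alternative mentioned in the message, would sidestep KUAP entirely: model_icbi() never counts a fault for addresses at or above the assumed kernel boundary.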
  13. 04 Mar, 2020 7 commits
  14. 26 Feb, 2020 2 commits
  15. 21 Feb, 2020 1 commit
    • D
      mm/memremap_pages: Introduce memremap_compat_align() · 9ffc1d19
      Dan Williams committed
      The "sub-section memory hotplug" facility allows memremap_pages() users
      like libnvdimm to compensate for hardware platforms like x86 that have a
      section size larger than their hardware memory mapping granularity.  The
      compensation that sub-section support affords is being tolerant of
      physical memory resources shifting by units smaller (64MiB on x86) than
      the memory-hotplug section size (128MiB), where the platform
      physical-memory mapping granularity is limited by the number and
      capability of address-decode-registers in the memory controller.
      
      While the sub-section support allows memremap_pages() to operate on
      sub-section (2MiB) granularity, the Power architecture may still
      require 16MiB alignment on "!radix_enabled()" platforms.
      
      In order for libnvdimm to be able to detect and manage this per-arch
      limitation, introduce memremap_compat_align() as a common minimum
      alignment across all driver-facing memory-mapping interfaces, and let
      Power override it to 16MiB in the "!radix_enabled()" case.
      
      The assumption / requirement for 16MiB to be a viable
      memremap_compat_align() value is that Power does not have platforms
      where its equivalent of address-decode-registers ever hardware remaps a
      persistent memory resource on smaller than 16MiB boundaries. Note that I
      tried my best to not add a new Kconfig symbol, but header include
      entanglements defeated the #ifndef memremap_compat_align design pattern
      and the need to export it defeats the __weak design pattern for arch
      overrides.
      
      Based on an initial patch by Aneesh.
      
      Link: http://lore.kernel.org/r/CAPcyv4gBGNP95APYaBcsocEa50tQj9b5h__83vgngjq3ouGX_Q@mail.gmail.com
      Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Reported-by: Jeff Moyer <jmoyer@redhat.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      9ffc1d19
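
The per-arch alignment rule described in this commit can be sketched as a small userspace model. This is a sketch under assumptions: it assumes the common default equals the 2MiB sub-section granularity mentioned above, and the model_* names and the radix_enabled parameter are hypothetical; the kernel's memremap_compat_align() takes no arguments and the Power override consults the MMU state itself.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_SUBSECTION_SIZE (2ULL << 20)   /* 2MiB sub-section granularity */
#define MODEL_HASH_ALIGN      (16ULL << 20)  /* 16MiB for "!radix_enabled()" */

/*
 * Model of the common minimum alignment: the sub-section size by
 * default, raised to 16MiB on Power when radix is not enabled.
 */
static unsigned long long model_memremap_compat_align(bool radix_enabled)
{
	return radix_enabled ? MODEL_SUBSECTION_SIZE : MODEL_HASH_ALIGN;
}

/* A range is usable when both its start and its size meet the alignment. */
static bool model_range_is_compat_aligned(unsigned long long start,
					  unsigned long long size,
					  bool radix_enabled)
{
	unsigned long long mask = model_memremap_compat_align(radix_enabled) - 1;

	return (start & mask) == 0 && (size & mask) == 0;
}

/* Usage example: the same range is accepted or rejected per MMU mode. */
static void memremap_compat_align_example(void)
{
	unsigned long long start = 18ULL << 20;  /* 18MiB */
	unsigned long long size = 2ULL << 20;    /* 2MiB */

	assert(model_range_is_compat_aligned(start, size, true));
	assert(!model_range_is_compat_aligned(start, size, false));
}
```

A driver such as libnvdimm would query the value once and validate or pad its namespaces against it, instead of hard-coding an architecture-specific constant.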
  16. 19 Feb, 2020 2 commits
  17. 18 Feb, 2020 1 commit
    • C
      powerpc/32s: Fix DSI and ISI exceptions for CONFIG_VMAP_STACK · 232ca1ee
      Christophe Leroy committed
      hash_page() needs to read page tables from kernel memory. When the entire
      kernel memory is mapped by BATs, which is normally the case when
      CONFIG_STRICT_KERNEL_RWX is not set, it works even if the page hosting
      the page table is not referenced in the MMU hash table.
      
      However, if the page where the page table resides is not covered by
      a BAT, a DSI fault can be encountered from hash_page(), and it loops
      forever. This can happen when CONFIG_STRICT_KERNEL_RWX is selected
      and the alignment of the different regions is too small to allow
      covering the entire memory with BATs. This also happens when
      CONFIG_DEBUG_PAGEALLOC is selected or when booting with the 'nobats'
      flag.
      
      Also, if the page containing the kernel stack is not present in the
      MMU hash table, registers cannot be saved and a recursive DSI fault
      is encountered.
      
      To allow hash_page() to properly do its job at all times and load the
      MMU hash table whenever needed, it must run with data MMU disabled.
      This means it must be called before re-enabling data MMU. To allow
      this, registers clobbered by hash_page() and create_hpte() have to
      be saved in the thread struct together with SRR0, SRR1, DAR and DSISR.
      It is also necessary to ensure that DSI prolog doesn't overwrite
      regs saved by prolog of the current running exception. That means:
      - DSI can only use SPRN_SPRG_SCRATCH0
      - Exceptions must free SPRN_SPRG_SCRATCH0 before writing to the stack.
      
      This also fixes the Oops reported by Erhard when create_hpte() is
      called by add_hash_page().
      
      Due to prolog size increase, a few more exceptions had to get split
      in two parts.
      
      Fixes: cd08f109 ("powerpc/32s: Enable CONFIG_VMAP_STACK")
      Reported-by: Erhard F. <erhard_f@mailbox.org>
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Tested-by: Erhard F. <erhard_f@mailbox.org>
      Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=206501
      Link: https://lore.kernel.org/r/64a4aa44686e9fd4b01333401367029771d9b231.1581761633.git.christophe.leroy@c-s.fr
      232ca1ee