1. 07 5月, 2021 1 次提交
  2. 06 5月, 2021 1 次提交
    • P
      mm/gup: do not migrate zero page · 9afaf30f
      Pavel Tatashin 提交于
      On some platforms ZERO_PAGE(0) might end-up in a movable zone.  Do not
      migrate zero page in gup during longterm pinning as migration of zero page
      is not allowed.
      
      For example, in x86 QEMU with 16G of memory and kernelcore=5G parameter, I
      see the following:
      
      Boot#1: zero_pfn  0x48a8d zero_pfn zone: ZONE_DMA32
      Boot#2: zero_pfn 0x20168d zero_pfn zone: ZONE_MOVABLE
      
      On x86, empty_zero_page is declared in .bss and depending on the loader
      may end up in different physical locations during boots.
      
      Also, move is_zero_pfn() my_zero_pfn() functions under CONFIG_MMU, because
      zero_pfn that they are using is declared in memory.c which is compiled
      with CONFIG_MMU.
      
      Link: https://lkml.kernel.org/r/20210215161349.246722-9-pasha.tatashin@soleen.comSigned-off-by: NPavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Ira Weiny <ira.weiny@intel.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Jason Gunthorpe <jgg@nvidia.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sasha Levin <sashal@kernel.org>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Tyler Hicks <tyhicks@linux.microsoft.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9afaf30f
  3. 10 3月, 2021 1 次提交
  4. 27 2月, 2021 1 次提交
  5. 20 1月, 2021 1 次提交
  6. 03 12月, 2020 2 次提交
  7. 16 11月, 2020 1 次提交
    • A
      arch: pgtable: define MAX_POSSIBLE_PHYSMEM_BITS where needed · cef39703
      Arnd Bergmann 提交于
      Stefan Agner reported a bug when using zsram on 32-bit Arm machines
      with RAM above the 4GB address boundary:
      
        Unable to handle kernel NULL pointer dereference at virtual address 00000000
        pgd = a27bd01c
        [00000000] *pgd=236a0003, *pmd=1ffa64003
        Internal error: Oops: 207 [#1] SMP ARM
        Modules linked in: mdio_bcm_unimac(+) brcmfmac cfg80211 brcmutil raspberrypi_hwmon hci_uart crc32_arm_ce bcm2711_thermal phy_generic genet
        CPU: 0 PID: 123 Comm: mkfs.ext4 Not tainted 5.9.6 #1
        Hardware name: BCM2711
        PC is at zs_map_object+0x94/0x338
        LR is at zram_bvec_rw.constprop.0+0x330/0xa64
        pc : [<c0602b38>]    lr : [<c0bda6a0>]    psr: 60000013
        sp : e376bbe0  ip : 00000000  fp : c1e2921c
        r10: 00000002  r9 : c1dda730  r8 : 00000000
        r7 : e8ff7a00  r6 : 00000000  r5 : 02f9ffa0  r4 : e3710000
        r3 : 000fdffe  r2 : c1e0ce80  r1 : ebf979a0  r0 : 00000000
        Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
        Control: 30c5383d  Table: 235c2a80  DAC: fffffffd
        Process mkfs.ext4 (pid: 123, stack limit = 0x495a22e6)
        Stack: (0xe376bbe0 to 0xe376c000)
      
      As it turns out, zsram needs to know the maximum memory size, which
      is defined in MAX_PHYSMEM_BITS when CONFIG_SPARSEMEM is set, or in
      MAX_POSSIBLE_PHYSMEM_BITS on the x86 architecture.
      
      The same problem will be hit on all 32-bit architectures that have a
      physical address space larger than 4GB and happen to not enable sparsemem
      and include asm/sparsemem.h from asm/pgtable.h.
      
      After the initial discussion, I suggested just always defining
      MAX_POSSIBLE_PHYSMEM_BITS whenever CONFIG_PHYS_ADDR_T_64BIT is
      set, or provoking a build error otherwise. This addresses all
      configurations that can currently have this runtime bug, but
      leaves all other configurations unchanged.
      
      I looked up the possible number of bits in source code and
      datasheets, here is what I found:
      
       - on ARC, CONFIG_ARC_HAS_PAE40 controls whether 32 or 40 bits are used
       - on ARM, CONFIG_LPAE enables 40 bit addressing, without it we never
         support more than 32 bits, even though supersections in theory allow
         up to 40 bits as well.
       - on MIPS, some MIPS32r1 or later chips support 36 bits, and MIPS32r5
         XPA supports up to 60 bits in theory, but 40 bits are more than
         anyone will ever ship
       - On PowerPC, there are three different implementations of 36 bit
         addressing, but 32-bit is used without CONFIG_PTE_64BIT
       - On RISC-V, the normal page table format can support 34 bit
         addressing. There is no highmem support on RISC-V, so anything
         above 2GB is unused, but it might be useful to eventually support
         CONFIG_ZRAM for high pages.
      
      Fixes: 61989a80 ("staging: zsmalloc: zsmalloc memory allocation library")
      Fixes: 02390b87 ("mm/zsmalloc: Prepare to variable MAX_PHYSMEM_BITS")
      Acked-by: NThomas Bogendoerfer <tsbogend@alpha.franken.de>
      Reviewed-by: NStefan Agner <stefan@agner.ch>
      Tested-by: NStefan Agner <stefan@agner.ch>
      Acked-by: NMike Rapoport <rppt@linux.ibm.com>
      Link: https://lore.kernel.org/linux-mm/bdfa44bf1c570b05d6c70898e2bbb0acf234ecdf.1604762181.git.stefan@agner.ch/Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      cef39703
  8. 03 11月, 2020 1 次提交
    • J
      mm: always have io_remap_pfn_range() set pgprot_decrypted() · f8f6ae5d
      Jason Gunthorpe 提交于
      The purpose of io_remap_pfn_range() is to map IO memory, such as a
      memory mapped IO exposed through a PCI BAR.  IO devices do not
      understand encryption, so this memory must always be decrypted.
      Automatically call pgprot_decrypted() as part of the generic
      implementation.
      
      This fixes a bug where enabling AMD SME causes subsystems, such as RDMA,
      using io_remap_pfn_range() to expose BAR pages to user space to fail.
      The CPU will encrypt access to those BAR pages instead of passing
      unencrypted IO directly to the device.
      
      Places not mapping IO should use remap_pfn_range().
      
      Fixes: aca20d54 ("x86/mm: Add support to make use of Secure Memory Encryption")
      Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: "Dave Young" <dyoung@redhat.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Larry Woodman <lwoodman@redhat.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Toshimitsu Kani <toshi.kani@hpe.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/0-v1-025d64bdf6c4+e-amd_sme_fix_jgg@nvidia.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f8f6ae5d
  9. 27 9月, 2020 1 次提交
    • V
      mm/gup: fix gup_fast with dynamic page table folding · d3f7b1bb
      Vasily Gorbik 提交于
      Currently to make sure that every page table entry is read just once
      gup_fast walks perform READ_ONCE and pass pXd value down to the next
      gup_pXd_range function by value e.g.:
      
        static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
                                 unsigned int flags, struct page **pages, int *nr)
        ...
                pudp = pud_offset(&p4d, addr);
      
      This function passes a reference on that local value copy to pXd_offset,
      and might get the very same pointer in return.  This happens when the
      level is folded (on most arches), and that pointer should not be
      iterated.
      
      On s390 due to the fact that each task might have different 5,4 or
      3-level address translation and hence different levels folded the logic
      is more complex and non-iteratable pointer to a local copy leads to
      severe problems.
      
      Here is an example of what happens with gup_fast on s390, for a task
      with 3-level paging, crossing a 2 GB pud boundary:
      
        // addr = 0x1007ffff000, end = 0x10080001000
        static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
                                 unsigned int flags, struct page **pages, int *nr)
        {
              unsigned long next;
              pud_t *pudp;
      
              // pud_offset returns &p4d itself (a pointer to a value on stack)
              pudp = pud_offset(&p4d, addr);
              do {
                      // on second iteratation reading "random" stack value
                      pud_t pud = READ_ONCE(*pudp);
      
                      // next = 0x10080000000, due to PUD_SIZE/MASK != PGDIR_SIZE/MASK on s390
                      next = pud_addr_end(addr, end);
                      ...
              } while (pudp++, addr = next, addr != end); // pudp++ iterating over stack
      
              return 1;
        }
      
      This happens since s390 moved to common gup code with commit
      d1874a0c ("s390/mm: make the pxd_offset functions more robust") and
      commit 1a42010c ("s390/mm: convert to the generic
      get_user_pages_fast code").
      
      s390 tried to mimic static level folding by changing pXd_offset
      primitives to always calculate top level page table offset in pgd_offset
      and just return the value passed when pXd_offset has to act as folded.
      
      What is crucial for gup_fast and what has been overlooked is that
      PxD_SIZE/MASK and thus pXd_addr_end should also change correspondingly.
      And the latter is not possible with dynamic folding.
      
      To fix the issue in addition to pXd values pass original pXdp pointers
      down to gup_pXd_range functions.  And introduce pXd_offset_lockless
      helpers, which take an additional pXd entry value parameter.  This has
      already been discussed in
      
        https://lkml.kernel.org/r/20190418100218.0a4afd51@mschwideX1
      
      Fixes: 1a42010c ("s390/mm: convert to the generic get_user_pages_fast code")
      Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Reviewed-by: NGerald Schaefer <gerald.schaefer@linux.ibm.com>
      Reviewed-by: NAlexander Gordeev <agordeev@linux.ibm.com>
      Reviewed-by: NJason Gunthorpe <jgg@nvidia.com>
      Reviewed-by: NMike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: <stable@vger.kernel.org>	[5.2+]
      Link: https://lkml.kernel.org/r/patch.git-943f1e5dcff2.your-ad-here.call-01599856292-ext-8676@work.hoursSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d3f7b1bb
  10. 04 9月, 2020 1 次提交
    • S
      mm: Add arch hooks for saving/restoring tags · 8a84802e
      Steven Price 提交于
      Arm's Memory Tagging Extension (MTE) adds some metadata (tags) to
      every physical page, when swapping pages out to disk it is necessary to
      save these tags, and later restore them when reading the pages back.
      
      Add some hooks along with dummy implementations to enable the
      arch code to handle this.
      
      Three new hooks are added to the swap code:
       * arch_prepare_to_swap() and
       * arch_swap_invalidate_page() / arch_swap_invalidate_area().
      One new hook is added to shmem:
       * arch_swap_restore()
      Signed-off-by: NSteven Price <steven.price@arm.com>
      [catalin.marinas@arm.com: add unlock_page() on the error path]
      [catalin.marinas@arm.com: dropped the _tags suffix]
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Acked-by: NAndrew Morton <akpm@linux-foundation.org>
      8a84802e
  11. 18 8月, 2020 1 次提交
  12. 13 8月, 2020 1 次提交
  13. 31 7月, 2020 1 次提交
  14. 20 6月, 2020 1 次提交
  15. 10 6月, 2020 4 次提交
    • M
      mmap locking API: convert mmap_sem comments · c1e8d7c6
      Michel Lespinasse 提交于
      Convert comments that reference mmap_sem to reference mmap_lock instead.
      
      [akpm@linux-foundation.org: fix up linux-next leftovers]
      [akpm@linux-foundation.org: s/lockaphore/lock/, per Vlastimil]
      [akpm@linux-foundation.org: more linux-next fixups, per Michel]
      Signed-off-by: NMichel Lespinasse <walken@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Reviewed-by: NVlastimil Babka <vbabka@suse.cz>
      Reviewed-by: NDaniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Laurent Dufour <ldufour@linux.ibm.com>
      Cc: Liam Howlett <Liam.Howlett@oracle.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ying Han <yinghan@google.com>
      Link: http://lkml.kernel.org/r/20200520052908.204642-13-walken@google.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c1e8d7c6
    • M
      mm: consolidate pte_index() and pte_offset_*() definitions · 974b9b2c
      Mike Rapoport 提交于
      All architectures define pte_index() as
      
      	(address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)
      
      and all architectures define pte_offset_kernel() as an entry in the array
      of PTEs indexed by the pte_index().
      
      For the most architectures the pte_offset_kernel() implementation relies
      on the availability of pmd_page_vaddr() that converts a PMD entry value to
      the virtual address of the page containing PTEs array.
      
      Let's move x86 definitions of the PTE accessors to the generic place in
      <linux/pgtable.h> and then simply drop the respective definitions from the
      other architectures.
      
      The architectures that didn't provide pmd_page_vaddr() are updated to have
      that defined.
      
      The generic implementation of pte_offset_kernel() can be overridden by an
      architecture and alpha makes use of this because it has special ordering
      requirements for its version of pte_offset_kernel().
      
      [rppt@linux.ibm.com: v2]
        Link: http://lkml.kernel.org/r/20200514170327.31389-11-rppt@kernel.org
      [rppt@linux.ibm.com: update]
        Link: http://lkml.kernel.org/r/20200514170327.31389-12-rppt@kernel.org
      [rppt@linux.ibm.com: update]
        Link: http://lkml.kernel.org/r/20200514170327.31389-13-rppt@kernel.org
      [akpm@linux-foundation.org: fix x86 warning]
      [sfr@canb.auug.org.au: fix powerpc build]
        Link: http://lkml.kernel.org/r/20200607153443.GB738695@linux.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Cain <bcain@codeaurora.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Ley Foon Tan <ley.foon.tan@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Nick Hu <nickhu@andestech.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vincent Chen <deanbo422@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: http://lkml.kernel.org/r/20200514170327.31389-10-rppt@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      974b9b2c
    • M
      mm: pgtable: add shortcuts for accessing kernel PMD and PTE · e05c7b1f
      Mike Rapoport 提交于
      The powerpc 32-bit implementation of pgtable has nice shortcuts for
      accessing kernel PMD and PTE for a given virtual address.  Make these
      helpers available for all architectures.
      
      [rppt@linux.ibm.com: microblaze: fix page table traversal in setup_rt_frame()]
        Link: http://lkml.kernel.org/r/20200518191511.GD1118872@kernel.org
      [akpm@linux-foundation.org: s/pmd_ptr_k/pmd_off_k/ in various powerpc places]
      Signed-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Cain <bcain@codeaurora.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Ley Foon Tan <ley.foon.tan@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Nick Hu <nickhu@andestech.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vincent Chen <deanbo422@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: http://lkml.kernel.org/r/20200514170327.31389-9-rppt@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e05c7b1f
    • M
      mm: introduce include/linux/pgtable.h · ca5999fd
      Mike Rapoport 提交于
      The include/linux/pgtable.h is going to be the home of generic page table
      manipulation functions.
      
      Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and
      make the latter include asm/pgtable.h.
      Signed-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Cain <bcain@codeaurora.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Ley Foon Tan <ley.foon.tan@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Nick Hu <nickhu@andestech.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vincent Chen <deanbo422@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ca5999fd
  16. 03 6月, 2020 2 次提交
    • J
      mm: add functions to track page directory modifications · d8626138
      Joerg Roedel 提交于
      Patch series "mm: Get rid of vmalloc_sync_(un)mappings()", v3.
      
      After the recent issue with vmalloc and tracing code[1] on x86 and a
      long history of previous issues related to the vmalloc_sync_mappings()
      interface, I thought the time has come to remove it.  Please see [2],
      [3], and [4] for some other issues in the past.
      
      The patches add tracking of page-table directory changes to the vmalloc
      and ioremap code.  Depending on which page-table levels changes have
      been made, a new per-arch function is called:
      arch_sync_kernel_mappings().
      
      On x86-64 with 4-level paging, this function will not be called more
      than 64 times in a systems runtime (because vmalloc-space takes 64 PGD
      entries which are only populated, but never cleared).
      
      As a side effect this also allows to get rid of vmalloc faults on x86,
      making it safe to touch vmalloc'ed memory in the page-fault handler.
      Note that this potentially includes per-cpu memory.
      
      This patch (of 7):
      
      Add page-table allocation functions which will keep track of changed
      directory entries.  They are needed for new PGD, P4D, PUD, and PMD
      entries and will be used in vmalloc and ioremap code to decide whether
      any changes in the kernel mappings need to be synchronized between
      page-tables in the system.
      Signed-off-by: NJoerg Roedel <jroedel@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Acked-by: NAndy Lutomirski <luto@kernel.org>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: "H . Peter Anvin" <hpa@zytor.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Link: http://lkml.kernel.org/r/20200515140023.25469-1-joro@8bytes.org
      Link: http://lkml.kernel.org/r/20200515140023.25469-2-joro@8bytes.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d8626138
    • C
      mm: enforce that vmap can't map pages executable · cca98e9f
      Christoph Hellwig 提交于
      To help enforcing the W^X protection don't allow remapping existing pages
      as executable.
      
      x86 bits from Peter Zijlstra, arm64 bits from Mark Rutland.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Mark Rutland <mark.rutland@arm.com>.
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Gao Xiang <xiang@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: Michael Kelley <mikelley@microsoft.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Cc: Wei Liu <wei.liu@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mackerras <paulus@ozlabs.org>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: http://lkml.kernel.org/r/20200414131348.444715-20-hch@lst.deSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      cca98e9f
  17. 27 5月, 2020 2 次提交
  18. 05 5月, 2020 1 次提交
  19. 08 4月, 2020 1 次提交
  20. 04 2月, 2020 1 次提交
    • S
      mm: add generic p?d_leaf() macros · 93fab1b2
      Steven Price 提交于
      Patch series "Generic page walk and ptdump", v17.
      
      Many architectures current have a debugfs file for dumping the kernel page
      tables.  Currently each architecture has to implement custom functions for
      this because the details of walking the page tables used by the kernel are
      different between architectures.
      
      This series extends the capabilities of walk_page_range() so that it can
      deal with the page tables of the kernel (which have no VMAs and can
      contain larger huge pages than exist for user space).  A generic PTDUMP
      implementation is the implemented making use of the new functionality of
      walk_page_range() and finally arm64 and x86 are switch to using it,
      removing the custom table walkers.
      
      To enable a generic page table walker to walk the unusual mappings of the
      kernel we need to implement a set of functions which let us know when the
      walker has reached the leaf entry.  After a suggestion from Will Deacon
      I've chosen the name p?d_leaf() as this (hopefully) describes the purpose
      (and is a new name so has no historic baggage).  Some architectures have
      p?d_large macros but this is easily confused with "large pages".
      
      This series ends with a generic PTDUMP implemention for arm64 and x86.
      
      Mostly this is a clean up and there should be very little functional
      change.  The exceptions are:
      
      * arm64 PTDUMP debugfs now displays pages which aren't present (patch 22).
      
      * arm64 has the ability to efficiently process KASAN pages (which
        previously only x86 implemented).  This means that the combination of
        KASAN and DEBUG_WX is now useable.
      
      This patch (of 23):
      
      Exposing the pud/pgd levels of the page tables to walk_page_range() means
      we may come across the exotic large mappings that come with large areas of
      contiguous memory (such as the kernel's linear map).
      
      For architectures that don't provide all p?d_leaf() macros, provide
      generic do nothing default that are suitable where there cannot be leaf
      pages at that level.  Futher patches will add implementations for
      individual architectures.
      
      The name p?d_leaf() is chosen to minimize the confusion with existing uses
      of "large" pages and "huge" pages which do not necessary mean that the
      entry is a leaf (for example it may be a set of contiguous entries that
      only take 1 TLB slot).  For the purpose of walking the page tables we
      don't need to know how it will be represented in the TLB, but we do need
      to know for sure if it is a leaf of the tree.
      
      Link: http://lkml.kernel.org/r/20191218162402.45610-2-steven.price@arm.comSigned-off-by: NSteven Price <steven.price@arm.com>
      Acked-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "Liang, Kan" <kan.liang@linux.intel.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Alexandre Ghiti <alex@ghiti.fr>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Zong Li <zong.li@sifive.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      93fab1b2
  21. 01 12月, 2019 3 次提交
  22. 25 9月, 2019 2 次提交
  23. 06 5月, 2019 1 次提交
  24. 06 3月, 2019 2 次提交
  25. 29 12月, 2018 1 次提交
  26. 05 12月, 2018 3 次提交
  27. 02 11月, 2018 1 次提交
  28. 27 10月, 2018 1 次提交