1. 08 May, 2020 1 commit
    • arm64: Set GP bit in kernel page tables to enable BTI for the kernel · c8027285
      Committed by Mark Brown
      Now that the kernel is built with BTI annotations enable the feature by
      setting the GP bit in the stage 1 translation tables.  This is done
      based on the features supported by the boot CPU so that we do not need
      to rewrite the translation tables.
      
      In order to avoid potential issues on big.LITTLE systems when there are
      a mix of BTI and non-BTI capable CPUs in the system when we have enabled
      kernel mode BTI we change BTI to be a _STRICT_BOOT_CPU_FEATURE when we
      have kernel BTI.  This will prevent any CPUs that don't support BTI
      being started if the boot CPU supports BTI rather than simply not using
      BTI as we do when supporting BTI only in userspace.  The main concern is
      the possibility of BTYPE being preserved by a CPU that does not
      implement BTI when a thread is migrated to it resulting in an incorrect
      state which could generate an exception when the thread migrates back to
      a CPU that does support BTI.  If we encounter practical systems which
      mix BTI and non-BTI CPUs we will need to revisit this implementation.
      
      Since we currently do not generate landing pads in the BPF JIT we only
      map the base kernel text in this way.
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Link: https://lore.kernel.org/r/20200506195138.22086-5-broonie@kernel.org
      Signed-off-by: Will Deacon <will@kernel.org>
  2. 11 Apr, 2020 2 commits
    • mm/memory_hotplug: add pgprot_t to mhp_params · bfeb022f
      Committed by Logan Gunthorpe
      devm_memremap_pages() is currently used by the PCI P2PDMA code to create
      struct page mappings for IO memory.  At present, these mappings are
      created with PAGE_KERNEL which implies setting the PAT bits to be WB.
      However, on x86, an mtrr register will typically override this and force
      the cache type to be UC-.  In the case firmware doesn't set this
      register it is effectively WB and will typically result in a machine
      check exception when it's accessed.
      
      Other arches are not currently likely to function correctly seeing they
      don't have any MTRR registers to fall back on.
      
      To solve this, provide a way to specify the pgprot value explicitly to
      arch_add_memory().
      
      Of the arches that support MEMORY_HOTPLUG: x86_64, and arm64 need a
      simple change to pass the pgprot_t down to their respective functions
      which set up the page tables.  For x86_32, set the page tables
      explicitly using _set_memory_prot() (seeing they are already mapped).
      
      For ia64, s390 and sh, reject anything but PAGE_KERNEL settings -- this
      should be fine, for now, seeing these architectures don't support
      ZONE_DEVICE.
      
      A check in __add_pages() is also added to ensure the pgprot parameter
      was set for all arches.
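
The check described above can be illustrated with a minimal userspace sketch. The `pgprot_t` and `struct mhp_params` types here are simplified stand-ins modeled on the commit description, `SKETCH_PAGE_KERNEL` is an arbitrary illustrative value, and `add_pages_checked()` is a hypothetical name rather than the kernel's function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's pgprot_t (assumption). */
typedef struct { unsigned long pgprot; } pgprot_t;

/* Illustrative value only; the real PAGE_KERNEL is arch-specific. */
#define SKETCH_PAGE_KERNEL ((pgprot_t){ 0x3UL })

/* Modeled on the description: extended parameters for memory hot-add. */
struct mhp_params {
	void *altmap;     /* optional alternative memmap allocator */
	pgprot_t pgprot;  /* protection to use for the new mapping */
};

/*
 * Hypothetical helper: reject callers that left pgprot unset, which is
 * the kind of check the commit says was added to __add_pages().
 */
static int add_pages_checked(const struct mhp_params *params)
{
	if (params == NULL || params->pgprot.pgprot == 0)
		return -EINVAL;
	return 0;
}
```

A caller would fill in `pgprot` explicitly (for example with the PAGE_KERNEL equivalent) before passing the parameters down; a zero-initialized struct is rejected.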
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: David Hildenbrand <david@redhat.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Dan Williams <dan.j.williams@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Eric Badger <ebadger@gigaio.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Link: http://lkml.kernel.org/r/20200306170846.9333-7-logang@deltatee.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/memory_hotplug: rename mhp_restrictions to mhp_params · f5637d3b
      Committed by Logan Gunthorpe
      The mhp_restrictions struct really doesn't specify anything resembling a
      restriction anymore so rename it to be mhp_params as it is a list of
      extended parameters.
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Eric Badger <ebadger@gigaio.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Link: http://lkml.kernel.org/r/20200306170846.9333-3-logang@deltatee.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 04 Mar, 2020 1 commit
    • arm64/mm: Enable memory hot remove · bbd6ec60
      Committed by Anshuman Khandual
      The arch code for hot-remove must tear down portions of the linear map and
      vmemmap corresponding to memory being removed. In both cases the page
      tables mapping these regions must be freed, and when sparse vmemmap is in
      use the memory backing the vmemmap must also be freed.
      
      This patch adds unmap_hotplug_range() and free_empty_tables() helpers which
      can be used to tear down either region and calls it from vmemmap_free() and
      ___remove_pgd_mapping(). The free_mapped argument determines whether the
      backing memory will be freed.
      
      It makes two distinct passes over the kernel page table. In the first pass
      with unmap_hotplug_range() it unmaps, invalidates applicable TLB cache and
      frees backing memory if required (vmemmap) for each mapped leaf entry. In
      the second pass with free_empty_tables() it looks for empty page table
      sections whose page table page can be unmapped, TLB invalidated and freed.
      
      While freeing intermediate level page table pages bail out if any of its
      entries are still valid. This can happen for partially filled kernel page
      table either from a previously attempted failed memory hot add or while
      removing an address range which does not span the entire page table page
      range.
      
      The vmemmap region may share levels of table with the vmalloc region.
      There can be conflicts between hot remove freeing page table pages with
      a concurrent vmalloc() walking the kernel page table. This conflict can
      not just be solved by taking the init_mm ptl because of existing locking
      scheme in vmalloc(). So free_empty_tables() implements a floor and ceiling
      method which is borrowed from user page table tear with free_pgd_range()
      which skips freeing page table pages if intermediate address range is not
      aligned or maximum floor-ceiling might not own the entire page table page.
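
The floor and ceiling idea can be sketched in userspace as follows. This is a simplified illustration modeled on the alignment checks used by the generic user page table teardown code, not the arm64 implementation itself; `table_page_can_be_freed()` and its `span` parameter (the address range covered by one table page) are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether the table page covering [start, end) may be freed.
 * floor and ceiling bound the region the caller owns; a ceiling of 0
 * means "no upper limit". span must be a power of two.
 */
static bool table_page_can_be_freed(uint64_t start, uint64_t end,
				    uint64_t floor, uint64_t ceiling,
				    uint64_t span)
{
	uint64_t mask;

	assert(span != 0 && (span & (span - 1)) == 0);
	mask = ~(span - 1);

	/* Round down to the start of the range this table page maps. */
	start &= mask;
	if (start < floor)
		return false;	/* something below floor shares the page */

	if (ceiling) {
		ceiling &= mask;
		if (!ceiling)
			return false;
	}

	/* The "- 1" keeps a zero ceiling working as "no limit". */
	if (end - 1 > ceiling - 1)
		return false;	/* something above ceiling shares the page */

	return true;
}
```

With a 1 GiB `span`, a range that exactly fills one table page and is bounded by matching floor and ceiling values is freeable, while a floor or ceiling that cuts into the page makes the function skip it.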
      
      Boot memory on arm64 cannot be removed. Hence this registers a new memory
      hotplug notifier which prevents boot memory offlining and its removal.
      
      While here update arch_add_memory() to handle __add_pages() failures by
      just unmapping recently added kernel linear mapping. Now enable memory hot
      remove on arm64 platforms by default with ARCH_ENABLE_MEMORY_HOTREMOVE.
      
      This implementation is overall inspired from kernel page table tear down
      procedure on X86 architecture and user page table tear down method.
      
      [Mike and Catalin added P4D page table level support]
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  4. 04 Feb, 2020 1 commit
    • arm64: mm: convert mm/dump.c to use walk_page_range() · 102f45fd
      Committed by Steven Price
      Now walk_page_range() can walk kernel page tables, we can switch the arm64
      ptdump code over to using it, simplifying the code.
      
      Link: http://lkml.kernel.org/r/20191218162402.45610-22-steven.price@arm.com
      Signed-off-by: Steven Price <steven.price@arm.com>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Alexandre Ghiti <alex@ghiti.fr>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: "Liang, Kan" <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Zong Li <zong.li@sifive.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 05 Jan, 2020 1 commit
    • mm/memory_hotplug: shrink zones when offlining memory · feee6b29
      Committed by David Hildenbrand
      We currently try to shrink a single zone when removing memory.  We use
      the zone of the first page of the memory we are removing.  If that
      memmap was never initialized (e.g., memory was never onlined), we will
      read garbage and can trigger kernel BUGs (due to a stale pointer):
      
          BUG: unable to handle page fault for address: 000000000000353d
          #PF: supervisor write access in kernel mode
          #PF: error_code(0x0002) - not-present page
          PGD 0 P4D 0
          Oops: 0002 [#1] SMP PTI
          CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
          Workqueue: kacpi_hotplug acpi_hotplug_work_fn
          RIP: 0010:clear_zone_contiguous+0x5/0x10
          Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840
          RSP: 0018:ffffad2400043c98 EFLAGS: 00010246
          RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000
          RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40
          RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001
          R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
          R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680
          FS:  0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0
          DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
          Call Trace:
           __remove_pages+0x4b/0x640
           arch_remove_memory+0x63/0x8d
           try_remove_memory+0xdb/0x130
           __remove_memory+0xa/0x11
           acpi_memory_device_remove+0x70/0x100
           acpi_bus_trim+0x55/0x90
           acpi_device_hotplug+0x227/0x3a0
           acpi_hotplug_work_fn+0x1a/0x30
           process_one_work+0x221/0x550
           worker_thread+0x50/0x3b0
           kthread+0x105/0x140
           ret_from_fork+0x3a/0x50
          Modules linked in:
          CR2: 000000000000353d
      
      Instead, shrink the zones when offlining memory or when onlining failed.
      Introduce and use remove_pfn_range_from_zone() for that.  We now
      properly shrink the zones, even if we have DIMMs whereby
      
       - Some memory blocks fall into no zone (never onlined)
      
       - Some memory blocks fall into multiple zones (offlined+re-onlined)
      
       - Multiple memory blocks that fall into different zones
      
      Drop the zone parameter (with a potential dubious value) from
      __remove_pages() and __remove_section().
      
      Link: http://lkml.kernel.org/r/20191006085646.5768-6-david@redhat.com
      Fixes: f1dd2cd1 ("mm, memory_hotplug: do not associate hotadded memory to zones until online")	[visible after d0dc12e8]
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: <stable@vger.kernel.org>	[5.0+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 07 Nov, 2019 1 commit
    • arm/efi: EFI soft reservation to memblock · 16993c0f
      Committed by Dan Williams
      UEFI 2.8 defines an EFI_MEMORY_SP attribute bit to augment the
      interpretation of the EFI Memory Types as "reserved for a specific
      purpose".
      
      The proposed Linux behavior for specific purpose memory is that it is
      reserved for direct-access (device-dax) by default and not available for
      any kernel usage, not even as an OOM fallback.  Later, through udev
      scripts or another init mechanism, these device-dax claimed ranges can
      be reconfigured and hot-added to the available System-RAM with a unique
      node identifier. This device-dax management scheme implements "soft" in
      the "soft reserved" designation by allowing some or all of the
      reservation to be recovered as typical memory. This policy can be
      disabled at compile-time with CONFIG_EFI_SOFT_RESERVE=n, or runtime with
      efi=nosoftreserve.
      
      For this patch, update the ARM paths that consider
      EFI_CONVENTIONAL_MEMORY to optionally take the EFI_MEMORY_SP attribute
      into account as a reservation indicator. Publish the soft reservation as
      IORES_DESC_SOFT_RESERVED memory, similar to x86.
      
      (Based on an original patch by Ard)
      Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  7. 06 Nov, 2019 1 commit
  8. 27 Sep, 2019 1 commit
    • mm: treewide: clarify pgtable_page_{ctor,dtor}() naming · b4ed71f5
      Committed by Mark Rutland
      The naming of pgtable_page_{ctor,dtor}() seems to have confused a few
      people, and until recently arm64 used these erroneously/pointlessly for
      other levels of page table.
      
      To make it incredibly clear that these only apply to the PTE level, and to
      align with the naming of pgtable_pmd_page_{ctor,dtor}(), let's rename them
      to pgtable_pte_page_{ctor,dtor}().
      
      These changes were generated with the following shell script:
      
      ----
      git grep -lw 'pgtable_page_.tor' | while read FILE; do
          sed -i '{s/pgtable_page_ctor/pgtable_pte_page_ctor/}' $FILE;
          sed -i '{s/pgtable_page_dtor/pgtable_pte_page_dtor/}' $FILE;
      done
      ----
      
      ... with the documentation re-flowed to remain under 80 columns, and
      whitespace fixed up in macros to keep backslashes aligned.
      
      There should be no functional change as a result of this patch.
      
      Link: http://lkml.kernel.org/r/20190722141133.3116-1-mark.rutland@arm.com
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Yu Zhao <yuzhao@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 28 Aug, 2019 1 commit
  10. 23 Aug, 2019 1 commit
  11. 15 Aug, 2019 1 commit
    • arm64: memory: rename VA_START to PAGE_END · 77ad4ce6
      Committed by Mark Rutland
      Prior to commit:
      
        14c127c9 ("arm64: mm: Flip kernel VA space")
      
      ... VA_START described the start of the TTBR1 address space for a given
      VA size described by VA_BITS, where all kernel mappings began.
      
      Since that commit, VA_START described a portion midway through the
      address space, where the linear map ends and other kernel mappings
      begin.
      
      To avoid confusion, let's rename VA_START to PAGE_END, making it clear
      that it's not the start of the TTBR1 address space and implying that
      it's related to PAGE_OFFSET. Comments and other mnemonics are updated
      accordingly, along with a typo fix in the description of VMEMMAP_SIZE.
      
      There should be no functional change as a result of this patch.
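
To make the relationship between the two boundaries concrete, here is a small sketch that computes them for a given VA size. It assumes the start of the TTBR1 range is `-(1 << va_bits)` and the linear map ends at the midpoint `-(1 << (va_bits - 1))`, which is how the layout after the flip is described above; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Start of the kernel (TTBR1) VA range for a given VA size. */
static uint64_t sketch_page_offset(unsigned int va_bits)
{
	assert(va_bits > 1 && va_bits < 64);
	return -(UINT64_C(1) << va_bits);
}

/*
 * Midpoint of that range: after the flip, the linear map ends here and
 * the other kernel mappings begin.
 */
static uint64_t sketch_page_end(unsigned int va_bits)
{
	assert(va_bits > 1 && va_bits < 64);
	return -(UINT64_C(1) << (va_bits - 1));
}
```

For a 48-bit VA size this gives 0xffff000000000000 for the start of the range and 0xffff800000000000 for the end of the linear map, which shows why the old name VA_START was misleading for the latter.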
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Tested-by: Steve Capper <steve.capper@arm.com>
      Reviewed-by: Steve Capper <steve.capper@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
  12. 09 Aug, 2019 3 commits
    • arm64: mm: Remove vabits_user · 2c624fe6
      Committed by Steve Capper
      Previous patches have enabled 52-bit kernel + user VAs and there is no
      longer any scenario where user VA != kernel VA size.
      
      This patch removes the, now redundant, vabits_user variable and replaces
      usage with vabits_actual where appropriate.
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Steve Capper <steve.capper@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
    • arm64: mm: Introduce vabits_actual · 5383cc6e
      Committed by Steve Capper
      In order to support 52-bit kernel addresses detectable at boot time, one
      needs to know the actual VA_BITS detected. A new variable vabits_actual
      is introduced in this commit and employed for the KVM hypervisor layout,
      KASAN, fault handling and phys-to/from-virt translation where there
      would normally be compile time constants.
      
      In order to maintain performance in phys_to_virt, another variable
      physvirt_offset is introduced.
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Steve Capper <steve.capper@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
    • arm64: mm: Flip kernel VA space · 14c127c9
      Committed by Steve Capper
      In order to allow for a KASAN shadow that changes size at boot time, one
      must fix the KASAN_SHADOW_END for both 48 & 52-bit VAs and "grow" the
      start address. Also, it is highly desirable to maintain the same
      function addresses in the kernel .text between VA sizes. Both of these
      requirements necessitate us to flip the kernel address space halves s.t.
      the direct linear map occupies the lower addresses.
      
      This patch puts the direct linear map in the lower addresses of the
      kernel VA range and everything else in the higher ranges.
      
      We need to adjust:
       *) KASAN shadow region placement logic,
       *) KASAN_SHADOW_OFFSET computation logic,
       *) virt_to_phys, phys_to_virt checks,
       *) page table dumper.
      
      These are all small changes, that need to take place atomically, so they
      are bundled into this commit.
      
      As part of the re-arrangement, a guard region of 2MB (to preserve
      alignment for fixed map) is added after the vmemmap. Otherwise the
      vmemmap could intersect with IS_ERR pointers.
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Steve Capper <steve.capper@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
  13. 19 Jul, 2019 2 commits
    • mm/memory_hotplug: allow arch_remove_memory() without CONFIG_MEMORY_HOTREMOVE · 80ec922d
      Committed by David Hildenbrand
      We want to improve error handling while adding memory by allowing to use
      arch_remove_memory() and __remove_pages() even if
      CONFIG_MEMORY_HOTREMOVE is not set to e.g., implement something like:
      
      	arch_add_memory()
      	rc = do_something();
      	if (rc) {
      		arch_remove_memory();
      	}
      
      We won't get rid of CONFIG_MEMORY_HOTREMOVE for now, as it will require
      quite some dependencies for memory offlining.
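
The error-handling pattern sketched in the message above can be exercised in userspace roughly as follows. All functions here are hypothetical stubs with a `sketch_` prefix; they only track whether a range is mapped and do not reflect the real kernel signatures.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Tracks whether the stub "mapping" currently exists. */
static bool range_mapped;

/* Stub for the arch hook that maps the new memory. */
static int sketch_arch_add_memory(void)
{
	range_mapped = true;
	return 0;
}

/* Stub for the arch hook that tears the mapping down again. */
static void sketch_arch_remove_memory(void)
{
	range_mapped = false;
}

/* Stub for a later hot-add step that may fail. */
static int sketch_do_something(bool fail)
{
	return fail ? -ENOMEM : 0;
}

/* Add memory and roll the mapping back if a later step fails. */
static int sketch_add_memory(bool fail_later_step)
{
	int rc = sketch_arch_add_memory();

	if (rc)
		return rc;

	rc = sketch_do_something(fail_later_step);
	if (rc)
		sketch_arch_remove_memory();	/* undo the mapping */

	return rc;
}
```

The point of the commit is that the rollback call must be available even in configurations without memory hot-remove support, since it is needed on the hot-add error path.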
      
      Link: http://lkml.kernel.org/r/20190527111152.16324-7-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Oscar Salvador <osalvador@suse.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Mark Brown <broonie@kernel.org>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: "mike.travis@hpe.com" <mike.travis@hpe.com>
      Cc: Andrew Banman <andrew.banman@hpe.com>
      Cc: Arun KS <arunks@codeaurora.org>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Mathieu Malaterre <malat@debian.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chintan Pandya <cpandya@codeaurora.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Jun Yao <yaojun8558363@gmail.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yu Zhao <yuzhao@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • arm64/mm: add temporary arch_remove_memory() implementation · 22eb6346
      Committed by David Hildenbrand
      A proper arch_remove_memory() implementation is on its way, which also
      cleanly removes page tables in arch_add_memory() in case something goes
      wrong.
      
      As we want to use arch_remove_memory() in case something goes wrong
      during memory hotplug after arch_add_memory() finished, let's add a
      temporary hack that is sufficient enough until we get a proper
      implementation that cleans up page table entries.
      
      We will remove CONFIG_MEMORY_HOTREMOVE around this code in follow up
      patches.
      
      Link: http://lkml.kernel.org/r/20190527111152.16324-5-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Chintan Pandya <cpandya@codeaurora.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Jun Yao <yaojun8558363@gmail.com>
      Cc: Yu Zhao <yuzhao@google.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: Andrew Banman <andrew.banman@hpe.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arun KS <arunks@codeaurora.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: Mark Brown <broonie@kernel.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Mathieu Malaterre <malat@debian.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: "mike.travis@hpe.com" <mike.travis@hpe.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Oscar Salvador <osalvador@suse.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qian Cai <cai@lca.pw>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 17 Jul, 2019 1 commit
  15. 13 Jul, 2019 1 commit
    • arm64: switch to generic version of pte allocation · 50f11a8a
      Committed by Mike Rapoport
      The PTE allocations in arm64 are identical to the generic ones modulo the
      GFP flags.
      
      Using the generic pte_alloc_one() functions ensures that the user page
      tables are allocated with __GFP_ACCOUNT set.
      
      The arm64 definition of PGALLOC_GFP is removed and replaced with
      GFP_PGTABLE_USER for p[gum]d_alloc_one() for the user page tables and
      GFP_PGTABLE_KERNEL for the kernel page tables. The KVM memory cache is now
      using GFP_PGTABLE_USER.
      
      The mappings created with create_pgd_mapping() are now using
      GFP_PGTABLE_KERNEL.
      
      The conversion to the generic version of pte_free_kernel() removes the NULL
      check for pte.
      
      The pte_free() version on arm64 is identical to the generic one and
      can be simply dropped.
      
      [cai@lca.pw: fix a bogus GFP flag in pgd_alloc()]
        Link: https://lore.kernel.org/r/1559656836-24940-1-git-send-email-cai@lca.pw/
      [and fix it more]
        Link: https://lore.kernel.org/linux-mm/20190617151252.GF16810@rapoport-lnx/
      Link: http://lkml.kernel.org/r/1557296232-15361-5-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Sam Creasey <sammy@sammy.net>
      Cc: Vincent Chen <deanbo422@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  16. 19 Jun, 2019 1 commit
  17. 07 Jun, 2019 1 commit
  18. 04 Jun, 2019 2 commits
  19. 16 May, 2019 1 commit
    • arm64/mm: Inhibit huge-vmap with ptdump · 7ba36ecc
      Committed by Mark Rutland
      The arm64 ptdump code can race with concurrent modification of the
      kernel page tables. At the time this was added, this was sound as:
      
      * Modifications to leaf entries could result in stale information being
        logged, but would not result in a functional problem.
      
      * Boot time modifications to non-leaf entries (e.g. freeing of initmem)
        were performed when the ptdump code cannot be invoked.
      
      * At runtime, modifications to non-leaf entries only occurred in the
        vmalloc region, and these were strictly additive, as intermediate
        entries were never freed.
      
      However, since commit:
      
        commit 324420bf ("arm64: add support for ioremap() block mappings")
      
      ... it has been possible to create huge mappings in the vmalloc area at
      runtime, and as part of this existing intermediate levels of table may be
      removed and freed.
      
      It's possible for the ptdump code to race with this, and continue to
      walk tables which have been freed (and potentially poisoned or
      reallocated). As a result of this, the ptdump code may dereference bogus
      addresses, which could be fatal.
      
      Since huge-vmap is a TLB and memory optimization, we can disable it when
      the runtime ptdump code is in use to avoid this problem.
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Fixes: 324420bf ("arm64: add support for ioremap() block mappings")
      Acked-by: Ard Biesheuvel <ard.biesheuvel@arm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      7ba36ecc
  20. 15 May, 2019 2 commits
  21. 09 Apr, 2019 3 commits
  22. 13 Mar, 2019 1 commit
    • M
      memblock: memblock_phys_alloc(): don't panic · ecc3e771
      Mike Rapoport committed
      Make the memblock_phys_alloc() function an inline wrapper for
      memblock_phys_alloc_range() and update the memblock_phys_alloc() callers
      to check the returned value and panic in case of error.
      
      Link: http://lkml.kernel.org/r/1548057848-15136-8-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ecc3e771
  23. 02 Mar, 2019 1 commit
  24. 22 Jan, 2019 1 commit
  25. 29 Dec, 2018 2 commits
  26. 12 Dec, 2018 1 commit
    • R
      arm64: Add memory hotplug support · 4ab21506
      Robin Murphy committed
      Wire up the basic support for hot-adding memory. Since memory hotplug
      is fairly tightly coupled to sparsemem, we tweak pfn_valid() to also
      cross-check the presence of a section in the manner of the generic
      implementation, before falling back to memblock to check for no-map
      regions within a present section as before. By having arch_add_memory()
      create the linear mapping first, this then makes everything work in the
      way that __add_section() expects.
      
      We expect hotplug to be ACPI-driven, so the swapper_pg_dir updates
      should be safe from races by virtue of the global device hotplug lock.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      4ab21506
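      As a rough illustration of the pfn_valid() ordering described in
      commit 4ab21506 above, the sketch below models the two checks in
      Python: first the sparsemem section must be present, then the
      address must fall inside a mapped (not no-map) memblock region. The
      constants, the `present_sections` set and the `mapped_ranges` list
      are assumptions made for this example, not kernel data structures.

```python
from typing import Iterable, Set, Tuple

PAGE_SHIFT = 12         # assumption: 4 KiB pages
SECTION_SIZE_BITS = 30  # assumption: 1 GiB sparsemem sections


def pfn_to_section_nr(pfn: int) -> int:
    """Index of the sparsemem section that contains this page frame."""
    return pfn >> (SECTION_SIZE_BITS - PAGE_SHIFT)


def pfn_valid(pfn: int,
              present_sections: Set[int],
              mapped_ranges: Iterable[Tuple[int, int]]) -> bool:
    """Toy model: section presence first, then the memblock-style check.

    mapped_ranges holds half-open [start, end) physical ranges that stand
    in for memblock memory not marked no-map (hypothetical representation).
    """
    # Step 1: cross-check that the section exists at all.
    if pfn_to_section_nr(pfn) not in present_sections:
        return False
    # Step 2: within a present section, reject no-map holes.
    addr = pfn << PAGE_SHIFT
    return any(start <= addr < end for start, end in mapped_ranges)
```

      With section 1 present and a single 64 KiB mapped range at 1 GiB,
      the first page frame of that range is valid, a frame just past it
      (a hole inside a present section) is not, and any frame in an
      absent section is rejected before the range check runs.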
  27. 11 Dec, 2018 2 commits
    • W
      arm64: mm: EXPORT vabits_user to modules · 4a1daf29
      Will Deacon committed
      TASK_SIZE is defined using the vabits_user variable for 64-bit tasks,
      so ensure that this variable is exported to modules to avoid the
      following build breakage with allmodconfig:
      
       | ERROR: "vabits_user" [lib/test_user_copy.ko] undefined!
       | ERROR: "vabits_user" [drivers/misc/lkdtm/lkdtm.ko] undefined!
       | ERROR: "vabits_user" [drivers/infiniband/hw/mlx5/mlx5_ib.ko] undefined!
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      4a1daf29
    • S
      arm64: mm: introduce 52-bit userspace support · 67e7fdfc
      Steve Capper committed
      On arm64 there is optional support for a 52-bit virtual address space.
      To exploit this one has to be running with a 64KB page size and be
      running on hardware that supports this.
      
      For an arm64 kernel supporting a 48 bit VA with a 64KB page size,
      some changes are needed to support a 52-bit userspace:
       * TCR_EL1.T0SZ needs to be 12 instead of 16,
       * TASK_SIZE needs to reflect the new size.
      
      This patch implements the above when the support for 52-bit VAs is
      detected at early boot time.
      
      On arm64 userspace address translation is controlled by TTBR0_EL1. As
      well as userspace, TTBR0_EL1 controls:
       * The identity mapping,
       * EFI runtime code.
      
      It is possible to run a kernel with an identity mapping that has a
      larger VA size than userspace (and for this case __cpu_set_tcr_t0sz()
      would set TCR_EL1.T0SZ as appropriate). However, when the conditions for
      52-bit userspace are met, it is possible to keep TCR_EL1.T0SZ fixed at
      12. Thus in this patch, the TCR_EL1.T0SZ size changing logic is
      disabled.
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Steve Capper <steve.capper@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      67e7fdfc
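      The T0SZ values quoted in commit 67e7fdfc above follow from the
      architectural encoding, in which the TTBR0 region spans
      2^(64 - T0SZ) bytes. The Python sketch below works through that
      arithmetic; the helper names `t0sz` and `task_size` are made up for
      this example, and `task_size` only models the 64-bit task case the
      commit message mentions.

```python
def t0sz(va_bits: int) -> int:
    """T0SZ value for a TTBR0 region of 2**va_bits bytes (T0SZ = 64 - VA bits)."""
    if not 0 < va_bits < 64:
        raise ValueError(f"unsupported VA size: {va_bits}")
    return 64 - va_bits


def task_size(vabits_user: int) -> int:
    """Size in bytes of the user address space for a given number of VA bits."""
    return 1 << vabits_user


if __name__ == "__main__":
    for bits in (48, 52):
        print(f"{bits}-bit VA: T0SZ={t0sz(bits)}, TASK_SIZE={task_size(bits):#x}")
```

      Moving from 48 to 52 bits lowers T0SZ from 16 to 12 and makes the
      user address space 16 times larger.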
  28. 20 Nov, 2018 1 commit
    • A
      arm64: mm: apply r/o permissions of VM areas to its linear alias as well · c55191e9
      Ard Biesheuvel committed
      On arm64, we use block mappings and contiguous hints to map the linear
      region, to minimize the TLB footprint. However, this means that the
      entire region is mapped using read/write permissions, which we cannot
      modify at page granularity without having to take intrusive measures to
      prevent TLB conflicts.
      
      This means the linear aliases of pages belonging to read-only mappings
      (executable or otherwise) in the vmalloc region are also mapped read/write,
      and could potentially be abused to modify things like module code, bpf JIT
      code or other read-only data.
      
      So let's fix this, by extending the set_memory_ro/rw routines to take
      the linear alias into account. The consequence of enabling this is
      that we can no longer use block mappings or contiguous hints, so in
      cases where the TLB footprint of the linear region is a bottleneck,
      performance may be affected.
      
      Therefore, allow this feature to be runtime en/disabled, by setting
      rodata=full (or 'on' to disable just this enhancement, or 'off' to
      disable read-only mappings for code and r/o data entirely) on the
      kernel command line. Also, allow the default value to be set via a
      Kconfig option.
      Tested-by: Laura Abbott <labbott@redhat.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      c55191e9
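      The three rodata= settings described in commit c55191e9 above can
      be summarised as a pair of flags. The Python sketch below is only a
      model of that policy: the `RodataPolicy` type and both function
      names are invented for this example, and the block-mapping check
      ignores every other condition the kernel may apply.

```python
from typing import NamedTuple


class RodataPolicy(NamedTuple):
    rodata_enabled: bool  # read-only mappings for code and r/o data
    rodata_full: bool     # also apply r/o permissions to the linear alias


def parse_rodata(arg: str) -> RodataPolicy:
    """Map a rodata= command-line value onto the two policy flags."""
    if arg == "full":
        return RodataPolicy(rodata_enabled=True, rodata_full=True)
    if arg == "on":
        return RodataPolicy(rodata_enabled=True, rodata_full=False)
    if arg == "off":
        return RodataPolicy(rodata_enabled=False, rodata_full=False)
    raise ValueError(f"unrecognised rodata= value: {arg!r}")


def linear_map_may_use_blocks(policy: RodataPolicy) -> bool:
    """Block mappings and contiguous hints are given up only for 'full'."""
    return not policy.rodata_full
```

      Only rodata=full trades the TLB footprint of the linear region for
      page-granular permissions; 'on' and 'off' keep block mappings
      available in this model.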
  29. 09 Nov, 2018 1 commit
  30. 31 Oct, 2018 1 commit
    • M
      memblock: rename memblock_alloc{_nid,_try_nid} to memblock_phys_alloc* · 9a8dd708
      Mike Rapoport committed
      Make it explicit that the caller gets a physical address rather than a
      virtual one.
      
      This will also allow using memblock_alloc prefix for memblock allocations
      returning virtual address, which is done in the following patches.
      
      The conversion is done using the following semantic patch:
      
      @@
      expression e1, e2, e3;
      @@
      (
      - memblock_alloc(e1, e2)
      + memblock_phys_alloc(e1, e2)
      |
      - memblock_alloc_nid(e1, e2, e3)
      + memblock_phys_alloc_nid(e1, e2, e3)
      |
      - memblock_alloc_try_nid(e1, e2, e3)
      + memblock_phys_alloc_try_nid(e1, e2, e3)
      )
      
      Link: http://lkml.kernel.org/r/1536927045-23536-7-git-send-email-rppt@linux.vnet.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Serge Semin <fancer.lancer@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9a8dd708
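      For readers unfamiliar with semantic patches, the effect of the
      rename in commit 9a8dd708 above can be approximated with a regular
      expression. This Python sketch is a text-level stand-in, not what
      the commit used: unlike the semantic patch it does not parse C or
      check argument counts, and the function name `rename_calls` is
      invented for this example.

```python
import re

# Matches calls to memblock_alloc, memblock_alloc_nid and
# memblock_alloc_try_nid, but not other memblock_alloc_* helpers.
_OLD_CALL = re.compile(r"\bmemblock_alloc(_try_nid|_nid)?\s*\(")


def rename_calls(source: str) -> str:
    """Rewrite the three old call names to their memblock_phys_alloc* forms."""
    return _OLD_CALL.sub(
        lambda m: f"memblock_phys_alloc{m.group(1) or ''}(", source
    )


if __name__ == "__main__":
    print(rename_calls("addr = memblock_alloc(size, align);"))
```

      Names that merely share the prefix, and calls that have already
      been converted, are left untouched because the pattern requires the
      opening parenthesis right after one of the three exact names.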