1. 09 Nov 2018, 1 commit
  2. 31 Oct 2018, 1 commit
    • memblock: rename memblock_alloc{_nid,_try_nid} to memblock_phys_alloc* · 9a8dd708
      Committed by Mike Rapoport
      Make it explicit that the caller gets a physical address rather than a
      virtual one.
      
      This will also allow using the memblock_alloc prefix for memblock
      allocations returning virtual addresses, which is done in the
      following patches.
      
      The conversion is done using the following semantic patch:
      
      @@
      expression e1, e2, e3;
      @@
      (
      - memblock_alloc(e1, e2)
      + memblock_phys_alloc(e1, e2)
      |
      - memblock_alloc_nid(e1, e2, e3)
      + memblock_phys_alloc_nid(e1, e2, e3)
      |
      - memblock_alloc_try_nid(e1, e2, e3)
      + memblock_phys_alloc_try_nid(e1, e2, e3)
      )
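      
      For callers the change is purely mechanical. For example, a caller
      that wants a physical address now reads as follows (a minimal
      sketch against the renamed API; the page-table context and the
      panic message are illustrative):
      
      /* before: phys_addr_t pa = memblock_alloc(PAGE_SIZE, PAGE_SIZE); */
      phys_addr_t pa = memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
      if (!pa)
              panic("failed to allocate page table page\n");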
      
      Link: http://lkml.kernel.org/r/1536927045-23536-7-git-send-email-rppt@linux.vnet.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Serge Semin <fancer.lancer@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9a8dd708
  3. 11 Oct 2018, 1 commit
  4. 25 Sep 2018, 2 commits
    • arm64/mm: use fixmap to modify swapper_pg_dir · 2330b7ca
      Committed by Jun Yao
      Once swapper_pg_dir is in the rodata section, it will not be possible to
      modify it directly, but we will need to modify it in some cases.
      
      To enable this, we can use the fixmap when deliberately modifying
      swapper_pg_dir. As the pgd is only transiently mapped, this provides
      some resilience against illicit modification of the pgd, e.g. for
      Kernel Space Mirror Attack (KSMA).
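      
      The update pattern looks roughly as follows (a simplified sketch
      of the approach; the lock name and details are illustrative,
      built on the existing arm64 pgd fixmap helpers):
      
      static DEFINE_SPINLOCK(swapper_pgdir_lock);
      
      static void set_swapper_pgd(pgd_t *pgdp, pgd_t pgd)
      {
              pgd_t *fixmap_pgdp;
      
              spin_lock(&swapper_pgdir_lock);
              /* transiently map the pgd page writable via the fixmap */
              fixmap_pgdp = pgd_set_fixmap(__pa_symbol(pgdp));
              WRITE_ONCE(*fixmap_pgdp, pgd);
              /* tearing down the window also flushes the stale TLB entry */
              pgd_clear_fixmap();
              spin_unlock(&swapper_pgdir_lock);
      }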
      Signed-off-by: Jun Yao <yaojun8558363@gmail.com>
      Reviewed-by: James Morse <james.morse@arm.com>
      [Mark: simplify ifdeffery, commit message]
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      2330b7ca
    • arm64/mm: Separate boot-time page tables from swapper_pg_dir · 2b5548b6
      Committed by Jun Yao
      Since the address of swapper_pg_dir is fixed for a given kernel image,
      it is an attractive target for manipulation via an arbitrary write. To
      mitigate this we'd like to make it read-only by moving it into the
      rodata section.
      
      We require that swapper_pg_dir is at a fixed offset from tramp_pg_dir
      and reserved_ttbr0, so these will also need to move into rodata.
      However, swapper_pg_dir is allocated along with some transient page
      tables used for boot which we do not want to move into rodata.
      
      As a step towards this, this patch separates the boot-time page tables
      into a new init_pg_dir, and reduces swapper_pg_dir to the single page it
      needs to be. This allows us to retain the relationship between
      swapper_pg_dir, tramp_pg_dir, and reserved_ttbr0, while cleanly
      separating these from the boot-time page tables.
      
      The init_pg_dir holds all of the pgd/pud/pmd/pte levels needed during
      boot, and all of these levels will be freed when we switch to the
      swapper_pg_dir, which is initialized by the existing code in
      paging_init(). Since we start off on the init_pg_dir, we no longer need
      to allocate a transient page table in paging_init() in order to ensure
      that swapper_pg_dir isn't live while we initialize it.
      
      There should be no functional change as a result of this patch.
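      
      After this change the switch happens in paging_init(); a
      simplified sketch of the resulting flow (symbol names follow the
      commit message, details elided):
      
      void __init paging_init(void)
      {
              pgd_t *pgdp = pgd_set_fixmap(__pa_symbol(swapper_pg_dir));
      
              map_kernel(pgdp);
              map_mem(pgdp);
      
              pgd_clear_fixmap();
      
              /* switch TTBR1 over to the final tables ... */
              cpu_replace_ttbr1(lm_alias(swapper_pg_dir));
              init_mm.pgd = swapper_pg_dir;
      
              /* ... and free the transient boot-time tables */
              memblock_free(__pa_symbol(init_pg_dir),
                            __pa_symbol(init_pg_end) - __pa_symbol(init_pg_dir));
      }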
      Signed-off-by: Jun Yao <yaojun8558363@gmail.com>
      Reviewed-by: James Morse <james.morse@arm.com>
      [Mark: place init_pg_dir after BSS, fold mm changes, commit message]
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      2b5548b6
  5. 07 Sep 2018, 1 commit
    • arm64: fix erroneous warnings in page freeing functions · fac880c7
      Committed by Mark Rutland
      In pmd_free_pte_page() and pud_free_pmd_page() we try to warn if
      they hit a present non-table entry. However, in both cases we also
      warn for non-present entries, as the VM_WARN_ON() only checks that
      the entry is not a table entry.
      
      This has been observed to result in warnings when booting a v4.19-rc2
      kernel under qemu.
      
      Fix this by bailing out earlier for non-present entries.
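      
      The shape of the fix, sketched for pmd_free_pte_page()
      (simplified; the pud variant is analogous):
      
      pmd_t pmd = READ_ONCE(*pmdp);
      
      /* non-present entries are a no-op: bail out before warning */
      if (!pmd_present(pmd))
              return 1;
      /* only a *present* non-table entry is unexpected */
      if (!pmd_table(pmd)) {
              VM_WARN_ON(1);
              return 1;
      }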
      
      Fixes: ec28bb9c ("arm64: Implement page table free interfaces")
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      fac880c7
  6. 06 Jul 2018, 1 commit
    • arm64: Implement page table free interfaces · ec28bb9c
      Committed by Chintan Pandya
      arm64 requires break-before-make. Originally, before
      setting up a new pmd/pud entry for a huge mapping, the
      pmd/pud entry being modified could in some cases still
      be valid and point to a next-level page table, since the
      unmap leg only clears the leaf PTEs.
      
       a) This resulted in stale entries in TLBs (as some TLBs
          also cache intermediate mappings for performance
          reasons).
       b) Also, the modified pmd/pud was the only reference to
          the next-level page table, which was lost without
          being freed, so page tables were leaking.
      
      Implement pud_free_pmd_page() and pmd_free_pte_page() to
      enforce BBM and also free the leaking page tables.
      
      The implementation requires (see the sketch after this list):
       1) Clearing the current pud/pmd entry
       2) Invalidating the TLB
       3) Freeing the now-unused next-level page tables
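      
      A simplified sketch of the pmd-level interface following those
      three steps (modelled on the arm64 implementation; the pud variant
      is analogous):
      
      int pmd_free_pte_page(pmd_t *pmdp, unsigned long addr)
      {
              pte_t *table;
              pmd_t pmd = READ_ONCE(*pmdp);
      
              if (!pmd_present(pmd))
                      return 1;
      
              table = pte_offset_kernel(pmdp, addr);
              pmd_clear(pmdp);                  /* 1) break: clear the entry */
              __flush_tlb_kernel_pgtable(addr); /* 2) invalidate the TLB */
              pte_free_kernel(NULL, table);     /* 3) free the old table */
              return 1;
      }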
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Chintan Pandya <cpandya@codeaurora.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      ec28bb9c
  7. 05 Jul 2018, 1 commit
    • ioremap: Update pgtable free interfaces with addr · 785a19f9
      Committed by Chintan Pandya
      The following kernel panic was observed on ARM64 platform due to a stale
      TLB entry.
      
       1. ioremap with 4K size; a valid pte page table is set up.
       2. iounmap it; its pte entry is set to 0.
       3. ioremap the same address with 2M size, updating its pmd entry
          with a new value.
       4. The CPU may hit an exception because the old pmd entry is
          still in the TLB, which leads to a kernel panic.
      
      Commit b6bdb751 ("mm/vmalloc: add interfaces to free unmapped page
      table") addressed this panic by falling back to pte mappings in
      the above case on ARM64.
      
      To support pmd mappings in all cases, TLB purge needs to be performed
      in this case on ARM64.
      
      Add a new argument, 'addr', to pud_free_pmd_page() and
      pmd_free_pte_page() so that the TLB purge can be added later in
      separate patches.
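      
      The resulting interface change is just the extra parameter:
      
      /* before */
      int pud_free_pmd_page(pud_t *pud);
      int pmd_free_pte_page(pmd_t *pmd);
      
      /* after: the unmapped virtual address is passed through so that a
         later patch can purge the TLB for exactly that range */
      int pud_free_pmd_page(pud_t *pud, unsigned long addr);
      int pmd_free_pte_page(pmd_t *pmd, unsigned long addr);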
      
      [toshi.kani@hpe.com: merge changes, rewrite patch description]
      Fixes: 28ee90fe ("x86/mm: implement free pmd/pte page interfaces")
      Signed-off-by: Chintan Pandya <cpandya@codeaurora.org>
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20180627141348.21777-3-toshi.kani@hpe.com
      785a19f9
  8. 24 May 2018, 1 commit
  9. 23 Mar 2018, 1 commit
    • mm/vmalloc: add interfaces to free unmapped page table · b6bdb751
      Committed by Toshi Kani
      On architectures with CONFIG_HAVE_ARCH_HUGE_VMAP set, ioremap() may
      create pud/pmd mappings.  A kernel panic was observed on arm64 systems
      with Cortex-A75 in the following steps as described by Hanjun Guo.
      
       1. ioremap a 4K region; a valid page table is built.
       2. iounmap it; pte0 is set to 0.
       3. ioremap the same address with 2M size; pgd/pmd are unchanged,
          then a new value is set for the pmd.
       4. pte0 is leaked.
       5. The CPU may take an exception because the old pmd is still in
          the TLB, which leads to a kernel panic.
      
      This panic is not reproducible on x86, where INVLPG, called from
      iounmap, purges all levels of entries associated with the purged
      address. x86 still has the memory leak, however.
      
      The patch changes the ioremap path to free unmapped page table(s) since
      doing so in the unmap path has the following issues:
      
       - The iounmap() path is shared with vunmap(). Since vmap() only
         supports pte mappings, making vunmap() free a pte page would
         add overhead for regular vmap users, who do not need a pte page
         freed.
      
       - Checking if all entries in a pte page are cleared in the unmap path
         is racy, and serializing this check is expensive.
      
       - The unmap path calls free_vmap_area_noflush() to do lazy TLB purges.
         Clearing a pud/pmd entry before the lazy TLB purges needs extra TLB
         purge.
      
      Add two interfaces, pud_free_pmd_page() and pmd_free_pte_page(), which
      clear a given pud/pmd entry and free up a page for the lower level
      entries.
      
      This patch implements their stub functions on x86 and arm64, which
      serve as a workaround.
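      
      In the ioremap path the new hook slots in as an extra condition
      before installing a huge entry; a sketch of the pmd-level case
      (simplified from the lib/ioremap.c change, using the original
      single-argument signature):
      
      if (ioremap_pmd_enabled() &&
          ((next - addr) == PMD_SIZE) &&
          IS_ALIGNED(phys_addr + addr, PMD_SIZE) &&
          pmd_free_pte_page(pmd)) {      /* clear + free any old pte page */
              if (pmd_set_huge(pmd, phys_addr + addr, prot))
                      continue;
      }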
      
      [akpm@linux-foundation.org: fix typo in pmd_free_pte_page() stub]
      Link: http://lkml.kernel.org/r/20180314180155.19492-2-toshi.kani@hpe.com
      Fixes: e61ce6ad ("mm: change ioremap to set up huge I/O mappings")
      Reported-by: Lei Li <lious.lilei@hisilicon.com>
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Wang Xuefeng <wxf.wang@hisilicon.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Chintan Pandya <cpandya@codeaurora.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b6bdb751
  10. 26 Feb 2018, 1 commit
  11. 22 Feb 2018, 1 commit
  12. 17 Feb 2018, 1 commit
    • arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables · 20a004e7
      Committed by Will Deacon
      In many cases, page tables can be accessed concurrently by either another
      CPU (due to things like fast gup) or by the hardware page table walker
      itself, which may set access/dirty bits. In such cases, it is important
      to use READ_ONCE/WRITE_ONCE when accessing page table entries so that
      entries cannot be torn, merged or subject to apparent loss of coherence
      due to compiler transformations.
      
      Whilst there are some scenarios where this cannot happen (e.g.
      pinned kernel mappings for the linear region), the overhead of
      using READ_ONCE/WRITE_ONCE everywhere is minimal and makes the
      code an awful lot easier to reason about. This patch consistently
      uses these macros in the arch code, as well as explicitly
      namespacing pointers to page table entries from the entries
      themselves by adopting a 'p' suffix for the former (as is
      sometimes used elsewhere in the kernel source).
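      
      The resulting pattern, sketched for a pmd walk (an illustrative
      fragment; the 'p' suffix distinguishes the pointer from the entry
      value):
      
      pmd_t *pmdp = pmd_offset(pudp, addr);
      pmd_t pmd = READ_ONCE(*pmdp);   /* single, untorn read of the entry */
      
      if (!pmd_present(pmd))
              return;
      
      WRITE_ONCE(*pmdp, new_pmd);     /* single, untorn write back */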
      Tested-by: Yury Norov <ynorov@caviumnetworks.com>
      Tested-by: Richard Ruigrok <rruigrok@codeaurora.org>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      20a004e7
  13. 07 Feb 2018, 1 commit
  14. 15 Jan 2018, 2 commits
  15. 09 Jan 2018, 2 commits
  16. 23 Dec 2017, 2 commits
  17. 11 Dec 2017, 2 commits
  18. 07 Nov 2017, 1 commit
  19. 28 Jul 2017, 1 commit
  20. 30 May 2017, 1 commit
  21. 06 Apr 2017, 1 commit
    • arm64: kdump: protect crash dump kernel memory · 98d2e153
      Committed by Takahiro Akashi
      arch_kexec_protect_crashkres() and arch_kexec_unprotect_crashkres()
      are meant to be called by kexec_load() in order to protect the memory
      allocated for crash dump kernel once the image is loaded.
      
      The protection is implemented by unmapping the relevant segments in crash
      dump kernel memory, rather than making it read-only as other archs do,
      to prevent coherency issues due to potential cache aliasing (with
      mismatched attributes).
      
      Page-level mappings are used consistently here so that we can
      change the attributes of segments at page granularity, and also
      shrink the region at page granularity through
      /sys/kernel/kexec_crash_size, returning the freed memory to the
      buddy system.
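      
      A simplified sketch of the protect hook (based on the arm64
      implementation; set_memory_valid(..., 0) removes the mapping at
      page granularity):
      
      void arch_kexec_protect_crashkres(void)
      {
              int i;
      
              for (i = 0; i < kexec_crash_image->nr_segments; i++)
                      set_memory_valid(
                              __phys_to_virt(kexec_crash_image->segment[i].mem),
                              kexec_crash_image->segment[i].memsz >> PAGE_SHIFT,
                              0 /* invalid: unmap the segment */);
      }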
      Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      98d2e153
  22. 23 Mar 2017, 10 commits
  23. 24 Feb 2017, 1 commit
    • Revert "arm64: mm: set the contiguous bit for kernel mappings where appropriate" · d81bbe6d
      Committed by Mark Rutland
      This reverts commit 0bfc445d.
      
      When we change the permissions of regions mapped using contiguous
      entries, the architecture requires us to follow a Break-Before-Make
      strategy, breaking *all* associated entries before we can change any of
      the following properties of the entries:
      
       - presence of the contiguous bit
       - output address
       - attributes
       - permissions
      
      Failure to do so can result in a number of problems (e.g. TLB conflict
      aborts and/or erroneous results from TLB lookups).
      
      See ARM DDI 0487A.k_iss10775, "Misprogramming of the Contiguous bit",
      page D4-1762.
      
      We do not take this into account when altering the permissions of kernel
      segments in mark_rodata_ro(), where we change the permissions of live
      contiguous entries one-by-one, leaving them transiently inconsistent.
      This has been observed to result in failures on some fast model
      configurations.
      
      Unfortunately, we cannot follow Break-Before-Make here as we'd have to
      unmap kernel text and data used to perform the sequence.
      
      For the time being, revert commit 0bfc445d so as to avoid issues
      resulting from this misuse of the contiguous bit.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Reported-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <Will.Deacon@arm.com>
      Cc: stable@vger.kernel.org # v4.10
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      d81bbe6d
  24. 15 Feb 2017, 1 commit
  25. 13 Jan 2017, 1 commit
  26. 12 Jan 2017, 1 commit