1. 28 3月, 2018 1 次提交
  2. 25 3月, 2018 1 次提交
  3. 23 3月, 2018 2 次提交
    • T
      x86/mm: implement free pmd/pte page interfaces · 28ee90fe
      Toshi Kani 提交于
      Implement pud_free_pmd_page() and pmd_free_pte_page() on x86, which
      clear a given pud/pmd entry and free up lower level page table(s).
      
      The address range associated with the pud/pmd entry must have been
      purged by INVLPG.
      
      Link: http://lkml.kernel.org/r/20180314180155.19492-3-toshi.kani@hpe.com
      Fixes: e61ce6ad ("mm: change ioremap to set up huge I/O mappings")
      Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
      Reported-by: NLei Li <lious.lilei@hisilicon.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      28ee90fe
    • T
      mm/vmalloc: add interfaces to free unmapped page table · b6bdb751
      Toshi Kani 提交于
      On architectures with CONFIG_HAVE_ARCH_HUGE_VMAP set, ioremap() may
      create pud/pmd mappings.  A kernel panic was observed on arm64 systems
      with Cortex-A75 in the following steps as described by Hanjun Guo.
      
       1. ioremap a 4K size, valid page table will build,
       2. iounmap it, pte0 will set to 0;
       3. ioremap the same address with 2M size, pgd/pmd is unchanged,
          then set the a new value for pmd;
       4. pte0 is leaked;
       5. CPU may meet exception because the old pmd is still in TLB,
          which will lead to kernel panic.
      
      This panic is not reproducible on x86.  INVLPG, called from iounmap,
      purges all levels of entries associated with purged address on x86.  x86
      still has memory leak.
      
      The patch changes the ioremap path to free unmapped page table(s) since
      doing so in the unmap path has the following issues:
      
       - The iounmap() path is shared with vunmap(). Since vmap() only
         supports pte mappings, making vunmap() to free a pte page is an
         overhead for regular vmap users as they do not need a pte page freed
         up.
      
       - Checking if all entries in a pte page are cleared in the unmap path
         is racy, and serializing this check is expensive.
      
       - The unmap path calls free_vmap_area_noflush() to do lazy TLB purges.
         Clearing a pud/pmd entry before the lazy TLB purges needs extra TLB
         purge.
      
      Add two interfaces, pud_free_pmd_page() and pmd_free_pte_page(), which
      clear a given pud/pmd entry and free up a page for the lower level
      entries.
      
      This patch implements their stub functions on x86 and arm64, which work
      as workaround.
      
      [akpm@linux-foundation.org: fix typo in pmd_free_pte_page() stub]
      Link: http://lkml.kernel.org/r/20180314180155.19492-2-toshi.kani@hpe.com
      Fixes: e61ce6ad ("mm: change ioremap to set up huge I/O mappings")
      Reported-by: NLei Li <lious.lilei@hisilicon.com>
      Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Wang Xuefeng <wxf.wang@hisilicon.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Chintan Pandya <cpandya@codeaurora.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b6bdb751
  4. 21 3月, 2018 1 次提交
    • L
      kvm/x86: fix icebp instruction handling · 32d43cd3
      Linus Torvalds 提交于
      The undocumented 'icebp' instruction (aka 'int1') works pretty much like
      'int3' in the absense of in-circuit probing equipment (except,
      obviously, that it raises #DB instead of raising #BP), and is used by
      some validation test-suites as such.
      
      But Andy Lutomirski noticed that his test suite acted differently in kvm
      than on bare hardware.
      
      The reason is that kvm used an inexact test for the icebp instruction:
      it just assumed that an all-zero VM exit qualification value meant that
      the VM exit was due to icebp.
      
      That is not unlike the guess that do_debug() does for the actual
      exception handling case, but it's purely a heuristic, not an absolute
      rule.  do_debug() does it because it wants to ascribe _some_ reasons to
      the #DB that happened, and an empty %dr6 value means that 'icebp' is the
      most likely casue and we have no better information.
      
      But kvm can just do it right, because unlike the do_debug() case, kvm
      actually sees the real reason for the #DB in the VM-exit interruption
      information field.
      
      So instead of relying on an inexact heuristic, just use the actual VM
      exit information that says "it was 'icebp'".
      
      Right now the 'icebp' instruction isn't technically documented by Intel,
      but that will hopefully change.  The special "privileged software
      exception" information _is_ actually mentioned in the Intel SDM, even
      though the cause of it isn't enumerated.
      Reported-by: NAndy Lutomirski <luto@kernel.org>
      Tested-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      32d43cd3
  5. 17 3月, 2018 2 次提交
  6. 16 3月, 2018 2 次提交
  7. 15 3月, 2018 2 次提交
    • D
      x86, memremap: fix altmap accounting at free · a7e6c701
      Dan Williams 提交于
      Commit 24b6d416 "mm: pass the vmem_altmap to vmemmap_free" converted
      the vmemmap_free() path to pass the altmap argument all the way through
      the call chain rather than looking it up based on the page.
      Unfortunately that ends up over freeing altmap allocated pages in some
      cases since free_pagetable() is used to free both memmap space and pte
      space, where only the memmap stored in huge pages uses altmap
      allocations.
      
      Given that altmap allocations for memmap space are special cased in
      vmemmap_populate_hugepages() add a symmetric / special case
      free_hugepage_table() to handle altmap freeing, and cleanup the unneeded
      passing of altmap to leaf functions that do not require it.
      
      Without this change the sanity check accounting in
      devm_memremap_pages_release() will throw a warning with the following
      signature.
      
       nd_pmem pfn10.1: devm_memremap_pages_release: failed to free all reserved pages
       WARNING: CPU: 44 PID: 3539 at kernel/memremap.c:310 devm_memremap_pages_release+0x1c7/0x220
       CPU: 44 PID: 3539 Comm: ndctl Tainted: G             L   4.16.0-rc1-linux-stable #7
       RIP: 0010:devm_memremap_pages_release+0x1c7/0x220
       [..]
       Call Trace:
        release_nodes+0x225/0x270
        device_release_driver_internal+0x15d/0x210
        bus_remove_device+0xe2/0x160
        device_del+0x130/0x310
        ? klist_release+0x56/0x100
        ? nd_region_notify+0xc0/0xc0 [libnvdimm]
        device_unregister+0x16/0x60
      
      This was missed in testing since not all configurations will trigger
      this warning.
      
      Fixes: 24b6d416 ("mm: pass the vmem_altmap to vmemmap_free")
      Reported-by: NJane Chu <jane.chu@oracle.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      a7e6c701
    • T
      x86/mm: Fix vmalloc_fault to use pXd_large · 18a95521
      Toshi Kani 提交于
      Gratian Crisan reported that vmalloc_fault() crashes when CONFIG_HUGETLBFS
      is not set since the function inadvertently uses pXn_huge(), which always
      return 0 in this case.  ioremap() does not depend on CONFIG_HUGETLBFS.
      
      Fix vmalloc_fault() to call pXd_large() instead.
      
      Fixes: f4eafd8b ("x86/mm: Fix vmalloc_fault() to handle large pages properly")
      Reported-by: NGratian Crisan <gratian.crisan@ni.com>
      Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Cc: linux-mm@kvack.org
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Link: https://lkml.kernel.org/r/20180313170347.3829-2-toshi.kani@hpe.com
      18a95521
  8. 14 3月, 2018 2 次提交
  9. 12 3月, 2018 2 次提交
  10. 09 3月, 2018 1 次提交
    • F
      x86/kprobes: Fix kernel crash when probing .entry_trampoline code · c07a8f8b
      Francis Deslauriers 提交于
      Disable the kprobe probing of the entry trampoline:
      
      .entry_trampoline is a code area that is used to ensure page table
      isolation between userspace and kernelspace.
      
      At the beginning of the execution of the trampoline, we load the
      kernel's CR3 register. This has the effect of enabling the translation
      of the kernel virtual addresses to physical addresses. Before this
      happens most kernel addresses can not be translated because the running
      process' CR3 is still used.
      
      If a kprobe is placed on the trampoline code before that change of the
      CR3 register happens the kernel crashes because int3 handling pages are
      not accessible.
      
      To fix this, add the .entry_trampoline section to the kprobe blacklist
      to prohibit the probing of code before all the kernel pages are
      accessible.
      Signed-off-by: NFrancis Deslauriers <francis.deslauriers@efficios.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: mathieu.desnoyers@efficios.com
      Cc: mhiramat@kernel.org
      Link: http://lkml.kernel.org/r/1520565492-4637-2-git-send-email-francis.deslauriers@efficios.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c07a8f8b
  11. 08 3月, 2018 13 次提交
  12. 07 3月, 2018 6 次提交
  13. 06 3月, 2018 1 次提交
  14. 04 3月, 2018 1 次提交
  15. 02 3月, 2018 3 次提交