1. 10 December 2019, 1 commit
    • x86/mm/pat: Move the memtype related files to arch/x86/mm/pat/ · f9b57cf8
      Authored by Ingo Molnar
      - pat.c primarily offers the memtype APIs - so rename it to memtype.c.
      
      - pageattr.c primarily offers the set_memory*() page attribute APIs,
        which are exposed via the <asm/set_memory.h> header: name the .c file
        along the same pattern.
      
      I.e. perform these renames, and move them all next to each other in arch/x86/mm/pat/:
      
          pat.c             => memtype.c
          pat_internal.h    => memtype.h
          pat_interval.c    => memtype_interval.c
      
          pageattr.c        => set_memory.c
          pageattr-test.c   => cpa-test.c
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 12 November 2019, 1 commit
  3. 03 September 2019, 4 commits
  4. 30 August 2019, 1 commit
    • x86/mm/cpa: Prevent large page split when ftrace flips RW on kernel text · 7af01450
      Authored by Thomas Gleixner
      ftrace does not use text_poke() for enabling trace functionality. It uses
      its own mechanism and flips the whole kernel text to RW and back to RO.
      
      The CPA rework removed a loop-based check of 4k pages which tried to
      preserve a large page by checking, for each 4k page, whether the change
      would actually cover all pages in the large page.
      
      This resulted in pointless endless loops, as testing showed that it
      never actually preserved anything. Unfortunately, the testing failed to
      include ftrace, which is the one and only case that benefited from the
      4k loop.
      
      As a consequence, enabling function tracing or ftrace-based kprobes
      results in a full 4k split of the kernel text, which hurts iTLB
      performance.
      
      The kernel RO protection is the only valid case where this can actually
      preserve large pages.
      
      All other static protections (RO data, data NX, PCI, BIOS) are truly
      static. So a conflict with those protections which results in a split
      should only ever happen when a change to memory next to a protected
      region is attempted. Such conflicts rightfully split the large page
      to preserve the protected regions. In fact, a change to the protected
      regions themselves is a bug and is warned about.
      
      Add an exception to the static protection check for kernel text RO when
      the to-be-changed region spans a full large page, which allows the
      large mappings to be preserved. This also prevents the syslog from
      being spammed about CPA violations when ftrace is used.
      
      The exception needs to be removed once ftrace has switched over to
      text_poke(), which avoids the whole issue.
      
      Fixes: 585948f4 ("x86/mm/cpa: Avoid the 4k pages check completely")
      Reported-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Song Liu <songliubraving@fb.com>
      Reviewed-by: Song Liu <songliubraving@fb.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908282355340.1938@nanos.tec.linutronix.de
  5. 21 May 2019, 1 commit
  6. 30 April 2019, 2 commits
    • mm/hibernation: Make hibernation handle unmapped pages · d6332692
      Authored by Rick Edgecombe
      Make hibernate handle unmapped pages on the direct map when
      CONFIG_ARCH_HAS_SET_ALIAS=y is set. The functions behind that config
      allow pages to be set to invalid configurations, so hibernate should
      now check whether pages have valid mappings and handle unmapped pages
      during a hibernate save operation.
      
      Previously this checking was already done when CONFIG_DEBUG_PAGEALLOC=y
      was configured. The change does not appear to have a significant impact
      on hibernation performance: the save operation was measured at
      819.02 MB/s before this change and 813.32 MB/s after.
      
      Before:
      [    4.670938] PM: Wrote 171996 kbytes in 0.21 seconds (819.02 MB/s)
      
      After:
      [    4.504714] PM: Wrote 178932 kbytes in 0.22 seconds (813.32 MB/s)
      Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Pavel Machek <pavel@ucw.cz>
      Cc: <akpm@linux-foundation.org>
      Cc: <ard.biesheuvel@linaro.org>
      Cc: <deneen.t.dock@intel.com>
      Cc: <kernel-hardening@lists.openwall.com>
      Cc: <kristen@linux.intel.com>
      Cc: <linux_dti@icloud.com>
      Cc: <will.deacon@arm.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/20190426001143.4983-16-namit@vmware.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mm/cpa: Add set_direct_map_*() functions · d253ca0c
      Authored by Rick Edgecombe
      Add two new functions set_direct_map_default_noflush() and
      set_direct_map_invalid_noflush() for setting the direct map alias for the
      page to its default valid permissions and to an invalid state that cannot
      be cached in a TLB, respectively. These functions do not flush the TLB.
      
      Note, __kernel_map_pages() does something similar but flushes the TLB and
      doesn't reset the permission bits to default on all architectures.
      
      Also add an ARCH config ARCH_HAS_SET_DIRECT_MAP for specifying whether
      these have an actual implementation or a default empty one.
      Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <akpm@linux-foundation.org>
      Cc: <ard.biesheuvel@linaro.org>
      Cc: <deneen.t.dock@intel.com>
      Cc: <kernel-hardening@lists.openwall.com>
      Cc: <kristen@linux.intel.com>
      Cc: <linux_dti@icloud.com>
      Cc: <will.deacon@arm.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/20190426001143.4983-15-namit@vmware.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  7. 07 March 2019, 1 commit
    • x86/mm: Remove unused variable 'old_pte' · 24c41220
      Authored by Qian Cai
      Commit 3a19109e ("x86/mm: Fix try_preserve_large_page() to handle
      large PAT bit") fixed try_preserve_large_page() by using the
      corresponding pud/pmd prot/pfn interfaces, but left a variable unused
      because it no longer used pte_pfn().
      
      Later, commit 8679de09 ("x86/mm/cpa: Split, rename and clean up
      try_preserve_large_page()") renamed try_preserve_large_page() to
      __should_split_large_page(), but the unused variable remained.
      
      arch/x86/mm/pageattr.c: In function '__should_split_large_page':
      arch/x86/mm/pageattr.c:741:17: warning: variable 'old_pte' set but not
      used [-Wunused-but-set-variable]
      
      Fixes: 3a19109e ("x86/mm: Fix try_preserve_large_page() to handle large PAT bit")
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: dave.hansen@linux.intel.com
      Cc: luto@kernel.org
      Cc: peterz@infradead.org
      Cc: toshi.kani@hpe.com
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Link: https://lkml.kernel.org/r/20190301152924.94762-1-cai@lca.pw
  8. 08 February 2019, 1 commit
  9. 18 December 2018, 9 commits
  10. 03 December 2018, 1 commit
    • x86: Fix various typos in comments · a97673a1
      Authored by Ingo Molnar
      Go over arch/x86/ and fix common typos in comments,
      and a typo in an actual function argument name.
      
      No change in functionality intended.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  11. 30 November 2018, 1 commit
    • x86/mm/pageattr: Introduce helper function to unmap EFI boot services · 7e0dabd3
      Authored by Sai Praneeth Prakhya
      Ideally, after the kernel assumes control of the platform, firmware
      shouldn't access EFI boot services code/data regions. But it has been
      observed that this is not the case on many x86 platforms. Hence, during
      boot, the kernel reserves the EFI boot services code/data regions [1]
      and maps [2] them into efi_pgd so that the call to
      set_virtual_address_map() doesn't fail. After returning from
      set_virtual_address_map(), the kernel frees the reserved regions [3],
      but they remain mapped. Hence, introduce kernel_unmap_pages_in_pgd(),
      which will later be used to unmap the EFI boot services code/data
      regions.
      
      While at it, modify kernel_map_pages_in_pgd() by:
      
      1. Adding the __init modifier, because it's always used *only* during
         boot.
      2. Adding a warning if it's used after SMP is initialized, because it
         uses __flush_tlb_all(), which flushes mappings only on the current
         CPU.
      
      Unmapping the EFI boot services code/data regions will result in
      clearing the PAGE_PRESENT bit, and it shouldn't bother the L1TF cases
      because those are already handled by protnone_mask() in
      arch/x86/include/asm/pgtable-invert.h.
      
      [1] efi_reserve_boot_services()
      [2] efi_map_region() -> __map_region() -> kernel_map_pages_in_pgd()
      [3] efi_free_boot_services()
      Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arend van Spriel <arend.vanspriel@broadcom.com>
      Cc: Bhupesh Sharma <bhsharma@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Eric Snowberg <eric.snowberg@oracle.com>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Jon Hunter <jonathanh@nvidia.com>
      Cc: Julien Thierry <julien.thierry@arm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Nathan Chancellor <natechancellor@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: YiFei Zhu <zhuyifei1999@gmail.com>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/20181129171230.18699-5-ard.biesheuvel@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  12. 31 October 2018, 1 commit
  13. 30 October 2018, 1 commit
    • x86/mm/pat: Disable preemption around __flush_tlb_all() · f77084d9
      Authored by Sebastian Andrzej Siewior
      The WARN_ON_ONCE(__read_cr3() != build_cr3()) in switch_mm_irqs_off()
      triggers every once in a while during a snapshotted system upgrade.
      
      The warning triggers since commit decab088 ("x86/mm: Remove
      preempt_disable/enable() from __native_flush_tlb()"). The callchain is:
      
        get_page_from_freelist() -> post_alloc_hook() -> __kernel_map_pages()
      
      with CONFIG_DEBUG_PAGEALLOC enabled.
      
      Disable preemption during the CR3 reset / __flush_tlb_all() and add a
      comment explaining why preemption has to be disabled, so it won't be
      removed accidentally.
      
      Add another preemptible() check in __flush_tlb_all() to catch callers
      with preemption enabled when PGE is enabled, because with PGE enabled
      the warning in __native_flush_tlb() does not trigger. Suggested by
      Andy Lutomirski.
      
      Fixes: decab088 ("x86/mm: Remove preempt_disable/enable() from __native_flush_tlb()")
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181017103432.zgv46nlu3hc7k4rq@linutronix.de
  14. 28 September 2018, 15 commits