1. 16 December 2020, 1 commit
    • arch, mm: restore dependency of __kernel_map_pages() on DEBUG_PAGEALLOC · 5d6ad668
      Committed by Mike Rapoport
      The design of DEBUG_PAGEALLOC presumes that __kernel_map_pages() must
      never fail.  With this assumption it wouldn't be safe to allow general
      usage of this function.
      
      Moreover, some architectures that implement __kernel_map_pages() have this
      function guarded by #ifdef DEBUG_PAGEALLOC and some refuse to map/unmap
      pages when page allocation debugging is disabled at runtime.
      
      As all the users of __kernel_map_pages() were converted to use
      debug_pagealloc_map_pages(), it is safe to make it available only when
      DEBUG_PAGEALLOC is set.
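      
      A minimal sketch of the resulting guard, following the pattern in
      include/linux/mm.h (the exact context lines here are assumptions):
      
          #ifdef CONFIG_DEBUG_PAGEALLOC
          extern void __kernel_map_pages(struct page *page, int numpages, int enable);
          
          static inline void debug_pagealloc_map_pages(struct page *page, int numpages)
          {
                  if (debug_pagealloc_enabled_static())
                          __kernel_map_pages(page, numpages, 1);
          }
          
          static inline void debug_pagealloc_unmap_pages(struct page *page, int numpages)
          {
                  if (debug_pagealloc_enabled_static())
                          __kernel_map_pages(page, numpages, 0);
          }
          #else
          /* No-op stubs keep callers compilable when the feature is off. */
          static inline void debug_pagealloc_map_pages(struct page *page, int numpages) {}
          static inline void debug_pagealloc_unmap_pages(struct page *page, int numpages) {}
          #endif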
      
      Link: https://lkml.kernel.org/r/20201109192128.960-4-rppt@kernel.org
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5d6ad668
  2. 18 September 2020, 1 commit
  3. 30 June 2020, 1 commit
    • x86/mm/pat: Mark an intentional data race · cb38f820
      Committed by Qian Cai
      cpa_4k_install could be accessed concurrently, as noticed by KCSAN:
      
      read to 0xffffffffaa59a000 of 8 bytes by interrupt on cpu 7:
      cpa_inc_4k_install arch/x86/mm/pat/set_memory.c:131 [inline]
      __change_page_attr+0x10cf/0x1840 arch/x86/mm/pat/set_memory.c:1514
      __change_page_attr_set_clr+0xce/0x490 arch/x86/mm/pat/set_memory.c:1636
      __set_pages_np+0xc4/0xf0 arch/x86/mm/pat/set_memory.c:2148
      __kernel_map_pages+0xb0/0xc8 arch/x86/mm/pat/set_memory.c:2178
      kernel_map_pages include/linux/mm.h:2719 [inline] <snip>
      
      write to 0xffffffffaa59a000 of 8 bytes by task 1 on cpu 6:
      cpa_inc_4k_install arch/x86/mm/pat/set_memory.c:131 [inline]
      __change_page_attr+0x10ea/0x1840 arch/x86/mm/pat/set_memory.c:1514
      __change_page_attr_set_clr+0xce/0x490 arch/x86/mm/pat/set_memory.c:1636
      __set_pages_p+0xc4/0xf0 arch/x86/mm/pat/set_memory.c:2129
      __kernel_map_pages+0x2e/0xc8 arch/x86/mm/pat/set_memory.c:2176
      kernel_map_pages include/linux/mm.h:2719 [inline] <snip>
      
      Both accesses are due to the same "cpa_4k_install++" in
      cpa_inc_4k_install. A data race here could potentially be undesirable:
      depending on compiler optimizations or how x86 executes a non-LOCK'd
      increment, it may lose increments, corrupt the counter, etc. Since this
      counter only seems to be used for printing some stats, the data race
      itself is unlikely to harm the system. Thus, mark this intentional
      data race using the data_race() macro.
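      
      A sketch of the annotation, assuming the counter is bumped through a
      small helper macro as the traces suggest (data_race() comes from
      <linux/compiler.h>):
      
          /* Statistics counter; racy by design, so tell KCSAN it is intentional. */
          static unsigned long cpa_4k_install;
          
          #define cpa_inc_4k_install()    data_race(cpa_4k_install++)
      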
      Suggested-by: Marco Elver <elver@google.com>
      Signed-off-by: Qian Cai <cai@lca.pw>
      Acked-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
      cb38f820
  4. 01 May 2020, 1 commit
  5. 26 April 2020, 1 commit
  6. 23 April 2020, 1 commit
  7. 11 April 2020, 1 commit
  8. 27 March 2020, 1 commit
  9. 20 January 2020, 1 commit
  10. 10 December 2019, 4 commits
  11. 12 November 2019, 1 commit
  12. 03 September 2019, 4 commits
  13. 30 August 2019, 1 commit
    • x86/mm/cpa: Prevent large page split when ftrace flips RW on kernel text · 7af01450
      Committed by Thomas Gleixner
      ftrace does not use text_poke() for enabling trace functionality. It uses
      its own mechanism and flips the whole kernel text to RW and back to RO.
      
      The CPA rework removed a loop-based check of 4k pages which tried to
      preserve a large page by checking, for each 4k page, whether the change
      would actually cover all pages in the large page.
      
      This resulted in endless loops for nothing, as testing showed that it
      never actually preserved anything. Of course, that testing failed to
      include ftrace, which is the one and only case that benefited from the
      4k loop.
      
      As a consequence, enabling function tracing or ftrace-based kprobes
      results in a full 4k split of the kernel text, which hurts iTLB
      performance.
      
      The kernel RO protection is the only valid case where this can actually
      preserve large pages.
      
      All other static protections (RO data, data NX, PCI, BIOS) are truly
      static.  So a conflict with those protections that results in a split
      should only ever happen when a change to memory next to a protected
      region is attempted. Such conflicts rightfully split the large page
      to preserve the protected regions. In fact, a change to the protected
      regions themselves is a bug and is warned about.
      
      Add an exception to the static protection check for kernel text RO when
      the region to be changed spans a full large page, which allows the
      large mappings to be preserved. This also prevents the syslog from
      being spammed with CPA violations when ftrace is used.
      
      The exception needs to be removed once ftrace switches over to
      text_poke(), which avoids the whole issue.
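      
      An illustrative sketch of the check (names here are assumptions, not
      the exact set_memory.c code): the kernel-text-RO conflict is ignored
      when the change covers the entire large page, because no split is
      needed to honor it.
      
          /* True if [start, start + numpages * PAGE_SIZE) is exactly one large page. */
          static bool change_covers_full_lpage(unsigned long start,
                                               unsigned long numpages,
                                               unsigned long lpsize)
          {
                  return lpsize == numpages * PAGE_SIZE && !(start & (lpsize - 1));
          }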
      
      Fixes: 585948f4 ("x86/mm/cpa: Avoid the 4k pages check completely")
      Reported-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Song Liu <songliubraving@fb.com>
      Reviewed-by: Song Liu <songliubraving@fb.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908282355340.1938@nanos.tec.linutronix.de
      7af01450
  14. 21 May 2019, 1 commit
  15. 30 April 2019, 2 commits
    • mm/hibernation: Make hibernation handle unmapped pages · d6332692
      Committed by Rick Edgecombe
      Make hibernate handle unmapped pages on the direct map when
      CONFIG_ARCH_HAS_SET_ALIAS=y is set. These functions allow pages to be
      set to invalid configurations, so hibernate should check whether pages
      have valid mappings and handle them if they are unmapped when doing a
      hibernate save operation.
      
      Previously this checking was already done when CONFIG_DEBUG_PAGEALLOC=y
      was configured. It does not appear to have a big impact on hibernation
      performance: the speed of the save operation was measured at
      819.02 MB/s before this change and 813.32 MB/s after.
      
      Before:
      [    4.670938] PM: Wrote 171996 kbytes in 0.21 seconds (819.02 MB/s)
      
      After:
      [    4.504714] PM: Wrote 178932 kbytes in 0.22 seconds (813.32 MB/s)
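      
      A hedged sketch of the resulting logic, loosely following the
      safe_copy_page() pattern in kernel/power/snapshot.c (the surrounding
      details are assumptions):
      
          static void safe_copy_page(void *dst, struct page *s_page)
          {
                  if (kernel_page_present(s_page)) {
                          do_copy_page(dst, page_address(s_page));
                  } else {
                          /* Temporarily restore the direct-map alias, copy, unmap again. */
                          kernel_map_pages(s_page, 1, 1);
                          do_copy_page(dst, page_address(s_page));
                          kernel_map_pages(s_page, 1, 0);
                  }
          }
      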
      Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Pavel Machek <pavel@ucw.cz>
      Cc: <akpm@linux-foundation.org>
      Cc: <ard.biesheuvel@linaro.org>
      Cc: <deneen.t.dock@intel.com>
      Cc: <kernel-hardening@lists.openwall.com>
      Cc: <kristen@linux.intel.com>
      Cc: <linux_dti@icloud.com>
      Cc: <will.deacon@arm.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/20190426001143.4983-16-namit@vmware.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d6332692
    • x86/mm/cpa: Add set_direct_map_*() functions · d253ca0c
      Committed by Rick Edgecombe
      Add two new functions set_direct_map_default_noflush() and
      set_direct_map_invalid_noflush() for setting the direct map alias for the
      page to its default valid permissions and to an invalid state that cannot
      be cached in a TLB, respectively. These functions do not flush the TLB.
      
      Note, __kernel_map_pages() does something similar but flushes the TLB and
      doesn't reset the permission bits to default on all architectures.
      
      Also add an ARCH config ARCH_HAS_SET_DIRECT_MAP for specifying whether
      these have an actual implementation or a default empty one.
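      
      A sketch of the API shape this describes, modeled on the usual
      include/linux/set_memory.h convention (the exact stubs are
      assumptions):
      
          #ifdef CONFIG_ARCH_HAS_SET_DIRECT_MAP
          int set_direct_map_invalid_noflush(struct page *page);
          int set_direct_map_default_noflush(struct page *page);
          #else
          /* Default empty implementations for architectures without support. */
          static inline int set_direct_map_invalid_noflush(struct page *page)
          {
                  return 0;
          }
          static inline int set_direct_map_default_noflush(struct page *page)
          {
                  return 0;
          }
          #endif
      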
      Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <akpm@linux-foundation.org>
      Cc: <ard.biesheuvel@linaro.org>
      Cc: <deneen.t.dock@intel.com>
      Cc: <kernel-hardening@lists.openwall.com>
      Cc: <kristen@linux.intel.com>
      Cc: <linux_dti@icloud.com>
      Cc: <will.deacon@arm.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/20190426001143.4983-15-namit@vmware.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d253ca0c
  16. 07 March 2019, 1 commit
    • x86/mm: Remove unused variable 'old_pte' · 24c41220
      Committed by Qian Cai
      Commit 3a19109e ("x86/mm: Fix try_preserve_large_page() to handle
      large PAT bit") fixed try_preserve_large_page() by using the
      corresponding pud/pmd prot/pfn interfaces, but left a variable unused
      because the code no longer called pte_pfn().
      
      Later, commit 8679de09 ("x86/mm/cpa: Split, rename and clean up
      try_preserve_large_page()") renamed try_preserve_large_page() to
      __should_split_large_page(), but the unused variable remained.
      
      arch/x86/mm/pageattr.c: In function '__should_split_large_page':
      arch/x86/mm/pageattr.c:741:17: warning: variable 'old_pte' set but not
      used [-Wunused-but-set-variable]
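      
      A contrived illustration of the warning class (not the kernel code):
      a variable that is still assigned after a refactor but whose value is
      never read; the fix simply deletes it.
      
          static int demo(const int *kpte)
          {
                  int old_pte;            /* -Wunused-but-set-variable fires here */
          
                  old_pte = *kpte;        /* assigned, but never consumed afterwards */
                  return 0;
          }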
      
      Fixes: 3a19109e ("x86/mm: Fix try_preserve_large_page() to handle large PAT bit")
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: dave.hansen@linux.intel.com
      Cc: luto@kernel.org
      Cc: peterz@infradead.org
      Cc: toshi.kani@hpe.com
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Link: https://lkml.kernel.org/r/20190301152924.94762-1-cai@lca.pw
      24c41220
  17. 08 February 2019, 1 commit
  18. 18 December 2018, 9 commits
  19. 03 December 2018, 1 commit
    • x86: Fix various typos in comments · a97673a1
      Committed by Ingo Molnar
      Go over arch/x86/ and fix common typos in comments,
      and a typo in an actual function argument name.
      
      No change in functionality intended.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      a97673a1
  20. 30 November 2018, 1 commit
    • x86/mm/pageattr: Introduce helper function to unmap EFI boot services · 7e0dabd3
      Committed by Sai Praneeth Prakhya
      Ideally, after the kernel assumes control of the platform, firmware
      shouldn't access EFI boot services code/data regions. But this has
      been observed not to hold on many x86 platforms. Hence, during boot,
      the kernel reserves EFI boot services code/data regions [1] and maps [2]
      them to efi_pgd so that the call to set_virtual_address_map() doesn't
      fail. After returning from set_virtual_address_map(), the kernel frees
      the reserved regions [3] but they still remain mapped. Hence, introduce
      kernel_unmap_pages_in_pgd(), which will later be used to unmap EFI boot
      services code/data regions.
      
      While at it, modify kernel_map_pages_in_pgd() by:
      
      1. Adding the __init modifier, because it's always used *only* during boot.
      2. Adding a warning if it's used after SMP is initialized, because it
         uses __flush_tlb_all(), which flushes mappings only on the current CPU.
      
      Unmapping EFI boot services code/data regions will result in clearing
      the PAGE_PRESENT bit, and it shouldn't affect L1TF cases because that
      is already handled by protnone_mask() in
      arch/x86/include/asm/pgtable-invert.h.
      
      [1] efi_reserve_boot_services()
      [2] efi_map_region() -> __map_region() -> kernel_map_pages_in_pgd()
      [3] efi_free_boot_services()
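      
      A hedged sketch of the resulting helper pair; the prototypes below
      follow the description above, but their exact form is an assumption:
      
          /* Boot-only: both rely on __flush_tlb_all(), which flushes only this CPU. */
          int __init kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn,
                                             unsigned long address,
                                             unsigned long numpages,
                                             unsigned long page_flags);
          int __init kernel_unmap_pages_in_pgd(pgd_t *pgd, unsigned long address,
                                               unsigned long numpages);
      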
      Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arend van Spriel <arend.vanspriel@broadcom.com>
      Cc: Bhupesh Sharma <bhsharma@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Eric Snowberg <eric.snowberg@oracle.com>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Jon Hunter <jonathanh@nvidia.com>
      Cc: Julien Thierry <julien.thierry@arm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Nathan Chancellor <natechancellor@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: YiFei Zhu <zhuyifei1999@gmail.com>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/20181129171230.18699-5-ard.biesheuvel@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      7e0dabd3
  21. 31 October 2018, 1 commit
  22. 30 October 2018, 1 commit
    • x86/mm/pat: Disable preemption around __flush_tlb_all() · f77084d9
      Committed by Sebastian Andrzej Siewior
      The WARN_ON_ONCE(__read_cr3() != build_cr3()) in switch_mm_irqs_off()
      triggers every once in a while during a snapshotted system upgrade.
      
      The warning triggers since commit decab088 ("x86/mm: Remove
      preempt_disable/enable() from __native_flush_tlb()"). The callchain is:
      
        get_page_from_freelist() -> post_alloc_hook() -> __kernel_map_pages()
      
      with CONFIG_DEBUG_PAGEALLOC enabled.
      
      Disable preemption during CR3 reset / __flush_tlb_all() and add a
      comment explaining why preemption has to be disabled, so it won't be
      removed accidentally.
      
      Add another preemptible() check in __flush_tlb_all() to catch callers
      with preemption enabled, because with PGE enabled the warning in
      __native_flush_tlb() does not trigger. Suggested by Andy Lutomirski.
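      
      A sketch of the resulting pattern, paraphrased from the description
      above (the surrounding code is assumed, not quoted):
      
          /*
           * __flush_tlb_all() only flushes the current CPU, and the CR3
           * reload inside it must not be interrupted by a migration.
           */
          preempt_disable();
          __flush_tlb_all();
          preempt_enable();
      
      And, inside __flush_tlb_all() itself, something along the lines of
      VM_WARN_ON_ONCE(preemptible()) to catch preemptible callers even on
      PGE-capable systems.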
      
      Fixes: decab088 ("x86/mm: Remove preempt_disable/enable() from __native_flush_tlb()")
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181017103432.zgv46nlu3hc7k4rq@linutronix.de
      f77084d9
  23. 28 September 2018, 3 commits