- 27 3月, 2017 1 次提交
-
-
由 Kirill A. Shutemov 提交于
Straight-forward extension of existing code to support additional page table level. Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-arch@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20170317185515.8636-4-kirill.shutemov@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 30 1月, 2017 1 次提交
-
-
由 John Ogness 提交于
Although wbinvd() is faster than flushing many individual pages, it blocks the memory bus for "long" periods of time (>100us), thus directly causing unusually large latencies on all CPUs, regardless of any CPU isolation features that may be active. This is an unpriviledged operatation as it is exposed to user space via the graphics subsystem. For 1024 pages, flushing those pages individually can take up to 2200us, but the task remains fully preemptible during that time. Signed-off-by: NJohn Ogness <john.ogness@linutronix.de> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: linux-rt-users <linux-rt-users@vger.kernel.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 25 12月, 2016 1 次提交
-
-
由 Linus Torvalds 提交于
This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 9月, 2016 1 次提交
-
-
由 Matt Fleming 提交于
There's a mixture of signed 32-bit and unsigned 32-bit and 64-bit data types used for keeping track of how many pages have been mapped. This leads to hangs during boot when mapping large numbers of pages (multiple terabytes, as reported by Waiman) because those values are interpreted as being negative. commit 74256377 ("x86/mm/pat: Avoid truncation when converting cpa->numpages to address") fixed one of those bugs, but there is another lurking in __change_page_attr_set_clr(). Additionally, the return value type for the populate_*() functions can return negative values when a large number of pages have been mapped, triggering the error paths even though no error occurred. Consistently use 64-bit types on 64-bit platforms when counting pages. Even in the signed case this gives us room for regions 8PiB (pebibytes) in size whilst still allowing the usual negative value error checking idiom. Reported-by: NWaiman Long <waiman.long@hpe.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> CC: Theodore Ts'o <tytso@mit.edu> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Scott J Norton <scott.norton@hpe.com> Cc: Douglas Hatch <doug.hatch@hpe.com> Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
-
- 24 7月, 2016 2 次提交
-
-
由 Andy Lutomirski 提交于
In commit: 21cbc2822aa1 ("x86/mm/cpa: Unbreak populate_pgd(): stop trying to deallocate failed PUDs") I intended to add this comment, but I failed at using git. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/242baf8612394f4e31216f96d13c4d2e9b90d1b7.1469293159.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
Valdis Kletnieks bisected a boot failure back to this recent commit: 360cb4d1 ("x86/mm/cpa: In populate_pgd(), don't set the PGD entry until it's populated") I broke the case where a PUD table got allocated -- populate_pud() would wander off a pgd_none entry and get lost. I'm not sure how this survived my testing. Fix the original issue in a much simpler way. The problem was that, if we allocated a PUD table, failed to populate it, and freed it, another CPU could potentially keep using the PGD entry we installed (either by copying it via vmalloc_fault or by speculatively caching it). There's a straightforward fix: simply leave the top-level entry in place if this happens. This can't waste any significant amount of memory -- there are at most 256 entries like this systemwide and, as a practical matter, if we hit this failure path repeatedly, we're likely to reuse the same page anyway. For context, this is a reversion with this hunk added in: if (ret < 0) { + /* + * Leave the PUD page in place in case some other CPU or thread + * already found it, but remove any useless entries we just + * added to it. + */ - unmap_pgd_range(cpa->pgd, addr, + unmap_pud_range(pgd_entry, addr, addr + (cpa->numpages << PAGE_SHIFT)); return ret; } This effectively open-codes what the now-deleted unmap_pgd_range() function used to do except that unmap_pgd_range() used to try to free the page as well. Reported-by: NValdis Kletnieks <Valdis.Kletnieks@vt.edu> Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Mike Krinkin <krinkin.m.u@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Link: http://lkml.kernel.org/r/21cbc2822aa18aa812c0215f4231dbf5f65afa7f.1469249789.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 15 7月, 2016 2 次提交
-
-
由 Andy Lutomirski 提交于
kernel_unmap_pages_in_pgd() is dangerous: if a PGD entry in init_mm.pgd were to be cleared, callers would need to ensure that the pgd entry hadn't been propagated to any other pgd. Its only caller was efi_cleanup_page_tables(), and that, in turn, was unused, so just delete both functions. This leaves a couple of other helpers unused, so delete them, too. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Reviewed-by: NMatt Fleming <matt@codeblueprint.co.uk> Acked-by: NBorislav Petkov <bp@suse.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/77ff20fdde3b75cd393be5559ad8218870520248.1468527351.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
This avoids pointless races in which another CPU or task might see a partially populated global PGD entry. These races should normally be harmless, but, if another CPU propagates the entry via vmalloc_fault() and then populate_pgd() fails (due to memory allocation failure, for example), this prevents a use-after-free of the PGD entry. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/bf99df27eac6835f687005364bd1fbd89130946c.1468527351.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 13 7月, 2016 1 次提交
-
-
由 Dave Hansen 提交于
The page table manipulation code seems to have grown a couple of sites that are looking for empty PTEs. Just in case one of these entries got a stray bit set, use pte_none() instead of checking for a zero pte_val(). The use pte_same() makes me a bit nervous. If we were doing a pte_same() check against two cleared entries and one of them had a stray bit set, it might fail the pte_same() check. But, I don't think we ever _do_ pte_same() for cleared entries. It is almost entirely used for checking for races in fault-in paths. Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: dave.hansen@intel.com Cc: linux-mm@kvack.org Cc: mhocko@suse.com Link: http://lkml.kernel.org/r/20160708001915.813703D9@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 10 7月, 2016 1 次提交
-
-
由 Thomas Garnier 提交于
The new physical address randomized KASLR implementation can cause the kernel to be aligned close to the end of physical memory. In this case, _brk_end aligned to PMD will go beyond what is expected safe and hit the assert in __phys_addr_symbol(): VIRTUAL_BUG_ON(y >= KERNEL_IMAGE_SIZE); Instead, perform an inclusive range check to avoid incorrectly triggering the assert: kernel BUG at arch/x86/mm/physaddr.c:38! invalid opcode: 0000 [#1] SMP ... RIP: 0010:[<ffffffffbe055721>] __phys_addr_symbol+0x41/0x50 ... Call Trace: [<ffffffffbe052eb9>] cpa_process_alias+0xa9/0x210 [<ffffffffbe109011>] ? do_raw_spin_unlock+0xc1/0x100 [<ffffffffbe051eef>] __change_page_attr_set_clr+0x8cf/0xbd0 [<ffffffffbe201a4d>] ? vm_unmap_aliases+0x7d/0x210 [<ffffffffbe05237c>] change_page_attr_set_clr+0x18c/0x4e0 [<ffffffffbe0534ec>] set_memory_4k+0x2c/0x40 [<ffffffffbefb08b3>] check_bugs+0x28/0x2a [<ffffffffbefa4f52>] start_kernel+0x49d/0x4b9 [<ffffffffbefa4120>] ? early_idt_handler_array+0x120/0x120 [<ffffffffbefa4423>] x86_64_start_reservations+0x29/0x2b [<ffffffffbefa4568>] x86_64_start_kernel+0x143/0x152 Signed-off-by: NThomas Garnier <thgarnie@google.com> Signed-off-by: NKees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sai Praneeth <sai.praneeth.prakhya@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hpe.com> Link: http://lkml.kernel.org/r/20160615190545.GA26071@www.outflux.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 28 4月, 2016 1 次提交
-
-
由 Matt Fleming 提交于
It's not at all obvious that populate_pgd() and friends are only executed when mapping EFI virtual memory regions or that no other pageattr callers pass a ->pgd value. Reported-by: NAndy Lutomirski <luto@amacapital.net> Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1461614832-17633-4-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 31 3月, 2016 2 次提交
-
-
由 Borislav Petkov 提交于
Use the fast variant in the DRM code. Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dri-devel@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Link: http://lkml.kernel.org/r/1459266123-21878-7-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Borislav Petkov 提交于
Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1459266123-21878-6-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 16 3月, 2016 2 次提交
-
-
由 Matt Fleming 提交于
Scott reports that with the new separate EFI page tables he's seeing the following error on boot, caused by setting reserved bits in the page table structures (fault code is PF_RSVD | PF_PROT), swapper/0: Corrupted page table at address 17b102020 PGD 17b0e5063 PUD 1400000e3 Bad pagetable: 0009 [#1] SMP On first inspection the PUD is using a 1GB page size (_PAGE_PSE) and looks fine but that's only true if support for 1GB PUD pages ("pdpe1gb") is present in the CPU. Scott's Intel Celeron N2820 does not have that feature and so the _PAGE_PSE bit is reserved. Fix this issue by making the 1GB mapping code in conditional on "cpu_has_gbpages". This issue didn't come up in the past because the required mapping for the faulting address (0x17b102020) will already have been setup by the kernel in early boot before we got to efi_map_regions(), but we no longer use the standard kernel page tables during EFI calls. Reported-by: NScott Ashcroft <scott.ashcroft@talk21.com> Tested-by: NScott Ashcroft <scott.ashcroft@talk21.com> Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk> Acked-by: NBorislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Raphael Hertzog <hertzog@debian.org> Cc: Roger Shimizu <rogershimizu@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1457951581-27353-2-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Christian Borntraeger 提交于
We can use debug_pagealloc_enabled() to check if we can map the identity mapping with 2MB pages. We can also add the state into the dump_stack output. The patch does not touch the code for the 1GB pages, which ignored CONFIG_DEBUG_PAGEALLOC. Do we need to fence this as well? Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NDavid Rientjes <rientjes@google.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 26 2月, 2016 1 次提交
-
-
由 Dexuan Cui 提交于
"d1cd1210: x86, pageattr: Prevent overflow in slow_virt_to_phys() for X86_PAE" was unintentionally removed by the recent "34437e67: x86/mm: Fix slow_virt_to_phys() to handle large PAT bit". And, the variable 'phys_addr' was defined as "unsigned long" by mistake -- it should be "phys_addr_t". As a result, Hyper-V network driver in 32-PAE Linux guest can't work again. Fixes: commit 34437e67: "x86/mm: Fix slow_virt_to_phys() to handle large PAT bit" Signed-off-by: NDexuan Cui <decui@microsoft.com> Reviewed-by: NToshi Kani <toshi.kani@hpe.com> Cc: olaf@aepfle.de Cc: gregkh@linuxfoundation.org Cc: jasowang@redhat.com Cc: driverdev-devel@linuxdriverproject.org Cc: linux-mm@kvack.org Cc: apw@canonical.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Link: http://lkml.kernel.org/r/1456394292-9030-1-git-send-email-decui@microsoft.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 25 2月, 2016 1 次提交
-
-
由 Jan Beulich 提交于
set_memory_nx() (and set_memory_x()) currently differ in behavior from all other set_memory_*() functions when encountering a virtual address space hole within the kernel address range: They stop processing at the hole, but nevertheless report success (making the caller believe the operation was carried out on the entire range). While observed to be a problem - triggering the CONFIG_DEBUG_WX warning - only with out of tree code, I suspect (but didn't check) that on x86-64 the CONFIG_DEBUG_PAGEALLOC logic in free_init_pages() would, when called from free_initmem(), have the same effect on the set_memory_nx() called from mark_rodata_ro(). This unexpected behavior is a result of change_page_attr_set_clr() special casing changes to only the NX bit, in that it passes "false" as the "checkalias" argument to __change_page_attr_set_clr(). Since this flag becomes the "primary" argument of both __change_page_attr() and __cpa_process_fault(), the latter would so far return success without adjusting cpa->numpages. Success to the higher level callers, however, means that whatever cpa->numpages currently holds is the count of successfully processed pages. The cases when __change_page_attr() calls __cpa_process_fault(), otoh, don't generally mean the entire range got processed (as can be seen from one of the two success return paths in __cpa_process_fault() already adjusting ->numpages). Signed-off-by: NJan Beulich <jbeulich@suse.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/56BB0AD402000078000D05BF@prv-mh.provo.novell.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 22 2月, 2016 3 次提交
-
-
由 Kees Cook 提交于
This removes the CONFIG_DEBUG_RODATA option and makes it always enabled. This simplifies the code and also makes it clearer that read-only mapped memory is just as fundamental a security feature in kernel-space as it is in user-space. Suggested-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NKees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Brown <david.brown@linaro.org> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Emese Revfy <re.emese@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathias Krause <minipli@googlemail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: PaX Team <pageexec@freemail.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-hardening@lists.openwall.com Cc: linux-arch <linux-arch@vger.kernel.org> Link: http://lkml.kernel.org/r/1455748879-21872-4-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Sai Praneeth 提交于
As part of the preparation for the EFI_MEMORY_RO flag added in the UEFI 2.5 specification, we need the ability to map pages in kernel page tables without _PAGE_RW being set. Modify kernel_map_pages_in_pgd() to require its callers to pass _PAGE_RW if the pages need to be mapped read/write. Otherwise, we'll map the pages as read-only. Signed-off-by: NSai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Lee, Chun-Yi <jlee@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Cc: Ricardo Neri <ricardo.neri@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1455712566-16727-12-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Sai Praneeth 提交于
Since EFI page tables can be treated as kernel page tables they should be global. All the other page mapping functions in pageattr.c set the _PAGE_GLOBAL bit and we want to avoid inconsistencies when we map a page in the EFI code paths, for example when that page is split in __split_large_page(), etc. It also makes it easier to validate that the EFI region mappings have the correct attributes because there are fewer differences compared with regular kernel mappings. Signed-off-by: NSai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Cc: Ricardo Neri <ricardo.neri@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1455712566-16727-4-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 29 1月, 2016 1 次提交
-
-
由 Matt Fleming 提交于
There are a couple of nasty truncation bugs lurking in the pageattr code that can be triggered when mapping EFI regions, e.g. when we pass a cpa->pgd pointer. Because cpa->numpages is a 32-bit value, shifting left by PAGE_SHIFT will truncate the resultant address to 32-bits. Viorel-Cătălin managed to trigger this bug on his Dell machine that provides a ~5GB EFI region which requires 1236992 pages to be mapped. When calling populate_pud() the end of the region gets calculated incorrectly in the following buggy expression, end = start + (cpa->numpages << PAGE_SHIFT); And only 188416 pages are mapped. Next, populate_pud() gets invoked for a second time because of the loop in __change_page_attr_set_clr(), only this time no pages get mapped because shifting the remaining number of pages (1048576) by PAGE_SHIFT is zero. At which point the loop in __change_page_attr_set_clr() spins forever because we fail to map progress. Hitting this bug depends very much on the virtual address we pick to map the large region at and how many pages we map on the initial run through the loop. This explains why this issue was only recently hit with the introduction of commit a5caa209 ("x86/efi: Fix boot crash by mapping EFI memmap entries bottom-up at runtime, instead of top-down") It's interesting to note that safe uses of cpa->numpages do exist in the pageattr code. If instead of shifting ->numpages we multiply by PAGE_SIZE, no truncation occurs because PAGE_SIZE is a UL value, and so the result is unsigned long. To avoid surprises when users try to convert very large cpa->numpages values to addresses, change the data type from 'int' to 'unsigned long', thereby making it suitable for shifting by PAGE_SHIFT without any type casting. The alternative would be to make liberal use of casting, but that is far more likely to cause problems in the future when someone adds more code and fails to cast properly; this bug was difficult enough to track down in the first place. Reported-and-tested-by: NViorel-Cătălin Răpițeanu <rapiteanu.catalin@gmail.com> Acked-by: NBorislav Petkov <bp@alien8.de> Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk> Link: https://bugzilla.kernel.org/show_bug.cgi?id=110131 Link: http://lkml.kernel.org/r/1454067370-10374-1-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 12 1月, 2016 1 次提交
-
-
由 Dave Jones 提交于
In CONFIG_PAGEALLOC_DEBUG=y builds, we disable 2M pages. Unfortunatly when we split up mappings during boot, split_page_count() doesn't take this into account, and starts decrementing an empty direct_pages_count[] level. This results in /proc/meminfo showing crazy things like: DirectMap2M: 18446744073709543424 kB Signed-off-by: NDave Jones <davej@codemonkey.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 09 1月, 2016 1 次提交
-
-
由 Chris Wilson 提交于
Whilst inspecting the asm for clflush_cache_range() and some perf profiles that required extensive flushing of single cachelines (from part of the intel-gpu-tools GPU benchmarks), we noticed that gcc was reloading boot_cpu_data.x86_clflush_size on every iteration of the loop. We can manually hoist that read which perf regarded as taking ~25% of the function time for a single cacheline flush. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Acked-by: N"H. Peter Anvin" <hpa@zytor.com> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Borislav Petkov <bp@suse.de> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Sai Praneeth <sai.praneeth.prakhya@intel.com> Link: http://lkml.kernel.org/r/1452246933-10890-1-git-send-email-chris@chris-wilson.co.ukSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 29 11月, 2015 1 次提交
-
-
由 Matt Fleming 提交于
The x86 pageattr code is confused about the data that is stored in cpa->pfn, sometimes it's treated as a page frame number, sometimes it's treated as an unshifted physical address, and in one place it's treated as a pte. The result of this is that the mapping functions do not map the intended physical address. This isn't a problem in practice because most of the addresses we're mapping in the EFI code paths are already mapped in 'trampoline_pgd' and so the pageattr mapping functions don't actually do anything in this case. But when we move to using a separate page table for the EFI runtime this will be an issue. Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk> Reviewed-by: NBorislav Petkov <bp@suse.de> Acked-by: NBorislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1448658575-17029-3-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 25 10月, 2015 1 次提交
-
-
由 Sai Praneeth 提交于
When CONFIG_DEBUG_VIRTUAL is enabled, all accesses to __pa(address) are monitored to see whether address falls in direct mapping or kernel text mapping (see Documentation/x86/x86_64/mm.txt for details), if it does not, the kernel panics. During 1:1 mapping of EFI runtime services we access virtual addresses which are == physical addresses, thus the 1:1 mapping and these addresses do not fall in either of the above two regions and hence when passed as arguments to __pa() kernel panics as reported by Dave Hansen here https://lkml.kernel.org/r/5462999A.7090706@intel.com. So, before calling __pa() virtual addresses should be validated which results in skipping call to split_page_count() and that should be fine because it is used to keep track of everything *but* 1:1 mappings. Signed-off-by: NSai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Reported-by: NDave Hansen <dave.hansen@intel.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Cc: Ricardo Neri <ricardo.neri@intel.com> Cc: Glenn P Williamson <glenn.p.williamson@intel.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
-
- 23 9月, 2015 4 次提交
-
-
由 Toshi Kani 提交于
try_preserve_large_page() checks if new_prot is the same as old_prot. If so, it simply sets do_split to 0, and returns with no-operation. However, old_prot is set as a 4KB pgprot value while new_prot is a large page pgprot value. Now that old_prot is initially set from p?d_pgprot() as a large page pgprot value, fix it by not overwriting old_prot with a 4KB pgprot value. Signed-off-by: NToshi Kani <toshi.kani@hpe.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Konrad Wilk <konrad.wilk@oracle.com> Cc: Robert Elliot <elliott@hpe.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/1442514264-12475-12-git-send-email-toshi.kani@hpe.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Toshi Kani 提交于
__split_large_page() is called from __change_page_attr() to change the mapping attribute by splitting a given large page into smaller pages. This function uses pte_pfn() and pte_pgprot() for PUD/PMD, which do not handle the large PAT bit properly. Fix __split_large_page() by using the corresponding pud/pmd pfn/ pgprot interfaces. Also remove '#ifdef CONFIG_X86_64', which is not necessary. Signed-off-by: NToshi Kani <toshi.kani@hpe.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Konrad Wilk <konrad.wilk@oracle.com> Cc: Robert Elliot <elliott@hpe.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/1442514264-12475-11-git-send-email-toshi.kani@hpe.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Toshi Kani 提交于
try_preserve_large_page() is called from __change_page_attr() to change the mapping attribute of a given large page. This function uses pte_pfn() and pte_pgprot() for PUD/PMD, which do not handle the large PAT bit properly. Fix try_preserve_large_page() by using the corresponding pud/pmd prot/pfn interfaces. Also remove '#ifdef CONFIG_X86_64', which is not necessary. Signed-off-by: NToshi Kani <toshi.kani@hpe.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Konrad Wilk <konrad.wilk@oracle.com> Cc: Robert Elliot <elliott@hpe.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/1442514264-12475-10-git-send-email-toshi.kani@hpe.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Toshi Kani 提交于
slow_virt_to_phys() calls lookup_address() to obtain *pte and its level. It then calls pte_pfn() to obtain a physical address for any level. However, this physical address is not correct when the large PAT bit is set because pte_pfn() does not mask the large PAT bit properly for PUD/PMD. Fix slow_virt_to_phys() to use pud_pfn() and pmd_pfn() for 1GB and 2MB mapping levels. Signed-off-by: NToshi Kani <toshi.kani@hpe.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Konrad Wilk <konrad.wilk@oracle.com> Cc: Robert Elliot <elliott@hpe.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/1442514264-12475-8-git-send-email-toshi.kani@hpe.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 25 8月, 2015 1 次提交
-
-
由 Paul Gortmaker 提交于
The file pageattr.c is obj-y and it includes pageattr-test.c based on CPA_DEBUG (a bool), meaning that no code here is currently being built as a module by anyone. Lets remove the couple traces of modularity so that when reading the code there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1440459295-21814-3-git-send-email-paul.gortmaker@windriver.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 07 6月, 2015 2 次提交
-
-
由 Toshi Kani 提交于
Now that reserve_ram_pages_type() accepts the WT type, add set_memory_wt(), set_memory_array_wt() and set_pages_array_wt() in order to be able to set memory to Write-Through page cache mode. Also, extend ioremap_change_attr() to accept the WT type. Signed-off-by: NToshi Kani <toshi.kani@hp.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Elliott@hp.com Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arnd@arndb.de Cc: hch@lst.de Cc: hmh@hmh.eng.br Cc: jgross@suse.com Cc: konrad.wilk@oracle.com Cc: linux-mm <linux-mm@kvack.org> Cc: linux-nvdimm@lists.01.org Cc: stefan.bader@canonical.com Cc: yigal@plexistor.com Link: http://lkml.kernel.org/r/1433436928-31903-13-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Borislav Petkov 提交于
Now that we emulate a PAT table when PAT is disabled, there's no need for those checks anymore as the PAT abstraction will handle those cases too. Based on a conglomerate patch from Toshi Kani. Signed-off-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NToshi Kani <toshi.kani@hp.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Elliott@hp.com Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arnd@arndb.de Cc: hch@lst.de Cc: hmh@hmh.eng.br Cc: jgross@suse.com Cc: konrad.wilk@oracle.com Cc: linux-mm <linux-mm@kvack.org> Cc: linux-nvdimm@lists.01.org Cc: stefan.bader@canonical.com Cc: yigal@plexistor.com Link: http://lkml.kernel.org/r/1433436928-31903-4-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 03 6月, 2015 1 次提交
-
-
由 Stephen Rothwell 提交于
Nothing in <asm/io.h> uses anything from <linux/vmalloc.h>, so remove it from there and fix up the resulting build problems triggered on x86 {64|32}-bit {def|allmod|allno}configs. The breakages were triggering in places where x86 builds relied on vmalloc() facilities but did not include <linux/vmalloc.h> explicitly and relied on the implicit inclusion via <asm/io.h>. Also add: - <linux/init.h> to <linux/io.h> - <asm/pgtable_types> to <asm/io.h> ... which were two other implicit header file dependencies. Suggested-by: NDavid Miller <davem@davemloft.net> Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> [ Tidied up the changelog. ] Acked-by: NDavid Miller <davem@davemloft.net> Acked-by: NTakashi Iwai <tiwai@suse.de> Acked-by: NViresh Kumar <viresh.kumar@linaro.org> Acked-by: NVinod Koul <vinod.koul@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Anton Vorontsov <anton@enomsg.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Colin Cross <ccross@android.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: James E.J. Bottomley <JBottomley@odin.com> Cc: Jaroslav Kysela <perex@perex.cz> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Kees Cook <keescook@chromium.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Kristen Carlson Accardi <kristen@linux.intel.com> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Suma Ramars <sramars@cisco.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 27 5月, 2015 1 次提交
-
-
由 Luis R. Rodriguez 提交于
We use pat_enabled in x86-specific code to see if PAT is enabled or not but we're granting full access to it even though readers do not need to set it. If, for instance, we granted access to it to modules later they then could override the variable setting... no bueno. This renames pat_enabled to a new static variable __pat_enabled. Folks are redirected to use pat_enabled() now. Code that sets this can only be internal to pat.c. Apart from the early kernel parameter "nopat" to disable PAT, we also have a few cases that disable it later and make use of a helper pat_disable(). It is wrapped under an ifdef but since that code cannot run unless PAT was enabled its not required to wrap it with ifdefs, unwrap that. Likewise, since "nopat" doesn't really change non-PAT systems just remove that ifdef as well. Although we could add and use an early_param_off(), these helpers don't use __read_mostly but we want to keep __read_mostly for __pat_enabled as this is a hot path -- upon boot, for instance, a simple guest may see ~4k accesses to pat_enabled(). Since __read_mostly early boot params are not that common we don't add a helper for them just yet. Signed-off-by: NLuis R. Rodriguez <mcgrof@suse.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Walls <awalls@md.metrocast.net> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Christoph Lameter <cl@linux.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Airlie <airlied@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Doug Ledford <dledford@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kyle McMartin <kyle@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1430425520-22275-3-git-send-email-mcgrof@do-not-panic.com Link: http://lkml.kernel.org/r/1432628901-18044-13-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 11 5月, 2015 3 次提交
-
-
由 Dexuan Cui 提交于
The patch doesn't change any logic. Signed-off-by: NDexuan Cui <decui@microsoft.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1429776428-4475-1-git-send-email-decui@microsoft.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Luis R. Rodriguez 提交于
ioremap_nocache() currently uses UC- by default. Our goal is to eventually make UC the default. Linux maps UC- to PCD=1, PWT=0 page attributes on non-PAT systems. Linux maps UC to PCD=1, PWT=1 page attributes on non-PAT systems. On non-PAT and PAT systems a WC MTRR has different effects on pages with either of these attributes. In order to help with a smooth transition its best to enable use of UC (PCD,1, PWT=1) on a region as that ensures a WC MTRR will have no effect on a region, this however requires us to have an way to declare a region as UC and we currently do not have a way to do this. WC MTRR on non-PAT system with PCD=1, PWT=0 (UC-) yields WC. WC MTRR on non-PAT system with PCD=1, PWT=1 (UC) yields UC. WC MTRR on PAT system with PCD=1, PWT=0 (UC-) yields WC. WC MTRR on PAT system with PCD=1, PWT=1 (UC) yields UC. A flip of the default ioremap_nocache() behaviour from UC- to UC can therefore regress a memory region from effective memory type WC to UC if MTRRs are used. Use of MTRRs should be phased out and in the best case only arch_phys_wc_add() use will remain, even if this happens arch_phys_wc_add() will have an effect on non-PAT systems and changes to default ioremap_nocache() behaviour could regress drivers. Now, ideally we'd use ioremap_nocache() on the regions in which we'd need uncachable memory types and avoid any MTRRs on those regions. There are however some restrictions on MTRRs use, such as the requirement of having the base and size of variable sized MTRRs to be powers of two, which could mean having to use a WC MTRR over a large area which includes a region in which write-combining effects are undesirable. Add ioremap_uc() to help with the both phasing out of MTRR use and also provide a way to blacklist small WC undesirable regions in devices with mixed regions which are size-implicated to use large WC MTRRs. Use of ioremap_uc() helps phase out MTRR use by avoiding regressions with an eventual flip of default behaviour or ioremap_nocache() from UC- to UC. Drivers working with WC MTRRs can use the below table to review and consider the use of ioremap*() and similar helpers to ensure appropriate behaviour long term even if default ioremap_nocache() behaviour changes from UC- to UC. Although ioremap_uc() is being added we leave set_memory_uc() to use UC- as only initial memory type setup is required to be able to accommodate existing device drivers and phase out MTRR use. It should also be clarified that set_memory_uc() cannot be used with IO memory, even though its use will not return any errors, it really has no effect. ---------------------------------------------------------------------- MTRR Non-PAT PAT Linux ioremap value Effective memory type ---------------------------------------------------------------------- Non-PAT | PAT PAT |PCD ||PWT ||| WC 000 WB _PAGE_CACHE_MODE_WB WC | WC WC 001 WC _PAGE_CACHE_MODE_WC WC* | WC WC 010 UC- _PAGE_CACHE_MODE_UC_MINUS WC* | WC WC 011 UC _PAGE_CACHE_MODE_UC UC | UC ---------------------------------------------------------------------- Signed-off-by: NLuis R. Rodriguez <mcgrof@suse.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Acked-by: NH. Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Antonino Daplas <adaplas@gmail.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Airlie <airlied@redhat.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Travis <travis@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: Thierry Reding <treding@nvidia.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Ville Syrjälä <syrjala@sci.fi> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-fbdev@vger.kernel.org Link: http://lkml.kernel.org/r/1430343851-967-2-git-send-email-mcgrof@do-not-panic.com Link: http://lkml.kernel.org/r/1431332153-18566-9-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Ross Zwisler 提交于
The current algorithm used in clflush_cache_range() can cause the last cache line of the buffer to be flushed twice. Fix that algorithm so that each cache line will only be flushed once. Reported-by: NH. Peter Anvin <hpa@zytor.com> Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Link: http://lkml.kernel.org/r/1430259192-18802-1-git-send-email-ross.zwisler@linux.intel.com Link: http://lkml.kernel.org/r/1431332153-18566-5-git-send-email-bp@alien8.de [ Changed it to 'void *' to simplify the type conversions. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 05 3月, 2015 1 次提交
-
-
由 Luis R. Rodriguez 提交于
direct_gbpages can be force enabled as an early parameter but not really have taken effect when DEBUG_PAGEALLOC or KMEMCHECK is enabled. You can also enable direct_gbpages right now if you have an x86_64 architecture but your CPU doesn't really have support for this feature. In both cases PG_LEVEL_1G won't actually be enabled but direct_gbpages is used in other areas under the assumptions that PG_LEVEL_1G was set. Fix this by putting together all requirements which make this feature sensible to enable under, and only enable both finally flipping on PG_LEVEL_1G and leaving PG_LEVEL_1G set when this is true. We only enable this feature then to be possible on sensible builds defined by the new ENABLE_DIRECT_GBPAGES. If the CPU has support for it you can either enable this by using the DIRECT_GBPAGES option or using the early kernel parameter. If a platform had support for this you can always force disable it as well. Signed-off-by: NLuis R. Rodriguez <mcgrof@suse.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: JBeulich@suse.com Cc: Jan Beulich <JBeulich@suse.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Lindgren <tony@atomide.com> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xishi Qiu <qiuxishi@huawei.com> Cc: julia.lawall@lip6.fr Link: http://lkml.kernel.org/r/1425518654-3403-3-git-send-email-mcgrof@do-not-panic.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 28 2月, 2015 1 次提交
-
-
由 Daniel Borkmann 提交于
This effectively unexports set_memory_ro() and set_memory_rw() functions, and thus reverts: a03352d2 ("x86: export set_memory_ro and set_memory_rw"). They have been introduced for debugging purposes in e1000e, but no module user is in mainline kernel (anymore?) and we explicitly do not want modules to use these functions, as they i.e. protect eBPF (interpreted & JIT'ed) images from malicious modifications or bugs. Outside of eBPF scope, I believe also other set_memory_*() functions should be unexported on x86 for modules. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@plumgrid.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Bruce Allan <bruce.w.allan@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davem@davemloft.net Link: http://lkml.kernel.org/r/a064393a0a5d319eebde5c761cfd743132d4f213.1425040940.git.daniel@iogearbox.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 14 12月, 2014 1 次提交
-
-
由 Joonsoo Kim 提交于
Now, we have prepared to avoid using debug-pagealloc in boottime. So introduce new kernel-parameter to disable debug-pagealloc in boottime, and makes related functions to be disabled in this case. Only non-intuitive part is change of guard page functions. Because guard page is effective only if debug-pagealloc is enabled, turning off according to debug-pagealloc is reasonable thing to do. Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Dave Hansen <dave@sr71.net> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Jungsoo Son <jungsoo.son@lge.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-