- 01 10月, 2018 2 次提交
-
-
由 Anshuman Khandual 提交于
Most memory abort exception handling related functions have the arguments in the order (addr, esr, regs) except is_el1_permission_fault(). This changes the argument order in this function as (addr, esr, regs) like others. Acked-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NAnshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Anshuman Khandual 提交于
Just replace hard code value of 63 (0x111111) with an existing macro ESR_ELx_FSC when parsing for the status code during fault exception. Acked-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NAnshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 25 9月, 2018 2 次提交
-
-
由 Jun Yao 提交于
Once swapper_pg_dir is in the rodata section, it will not be possible to modify it directly, but we will need to modify it in some cases. To enable this, we can use the fixmap when deliberately modifying swapper_pg_dir. As the pgd is only transiently mapped, this provides some resilience against illicit modification of the pgd, e.g. for Kernel Space Mirror Attack (KSMA). Signed-off-by: NJun Yao <yaojun8558363@gmail.com> Reviewed-by: NJames Morse <james.morse@arm.com> [Mark: simplify ifdeffery, commit message] Signed-off-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Jun Yao 提交于
Since the address of swapper_pg_dir is fixed for a given kernel image, it is an attractive target for manipulation via an arbitrary write. To mitigate this we'd like to make it read-only by moving it into the rodata section. We require that swapper_pg_dir is at a fixed offset from tramp_pg_dir and reserved_ttbr0, so these will also need to move into rodata. However, swapper_pg_dir is allocated along with some transient page tables used for boot which we do not want to move into rodata. As a step towards this, this patch separates the boot-time page tables into a new init_pg_dir, and reduces swapper_pg_dir to the single page it needs to be. This allows us to retain the relationship between swapper_pg_dir, tramp_pg_dir, and swapper_pg_dir, while cleanly separating these from the boot-time page tables. The init_pg_dir holds all of the pgd/pud/pmd/pte levels needed during boot, and all of these levels will be freed when we switch to the swapper_pg_dir, which is initialized by the existing code in paging_init(). Since we start off on the init_pg_dir, we no longer need to allocate a transient page table in paging_init() in order to ensure that swapper_pg_dir isn't live while we initialize it. There should be no functional change as a result of this patch. Signed-off-by: NJun Yao <yaojun8558363@gmail.com> Reviewed-by: NJames Morse <james.morse@arm.com> [Mark: place init_pg_dir after BSS, fold mm changes, commit message] Signed-off-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 21 9月, 2018 1 次提交
-
-
由 James Morse 提交于
include/linux/mmzone.h describes ARCH_HAS_HOLES_MEMORYMODEL as relevant when parts the memmap have been free()d. This would happen on systems where memory is smaller than a sparsemem-section, and the extra struct pages are expensive. pfn_valid() on these systems returns true for the whole sparsemem-section, so an extra memmap_valid_within() check is needed. On arm64 we have nomap memory, so always provide pfn_valid() to test for nomap pages. This means ARCH_HAS_HOLES_MEMORYMODEL's extra checks are already rolled up into pfn_valid(). Remove it. Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 18 9月, 2018 1 次提交
-
-
由 Vladimir Murzin 提交于
Common Not Private (CNP) is a feature of ARMv8.2 extension which allows translation table entries to be shared between different PEs in the same inner shareable domain, so the hardware can use this fact to optimise the caching of such entries in the TLB. CNP occupies one bit in TTBRx_ELy and VTTBR_EL2, which advertises to the hardware that the translation table entries pointed to by this TTBR are the same as every PE in the same inner shareable domain for which the equivalent TTBR also has CNP bit set. In case CNP bit is set but TTBR does not point at the same translation table entries for a given ASID and VMID, then the system is mis-configured, so the results of translations are UNPREDICTABLE. For kernel we postpone setting CNP till all cpus are up and rely on cpufeature framework to 1) patch the code which is sensitive to CNP and 2) update TTBR1_EL1 with CNP bit set. TTBR1_EL1 can be reprogrammed as result of hibernation or cpuidle (via __enable_mmu). For these two cases we restore CnP bit via __cpu_suspend_exit(). There are a few cases we need to care of changes in TTBR0_EL1: - a switch to idmap - software emulated PAN we rule out latter via Kconfig options and for the former we make sure that CNP is set for non-zero ASIDs only. Reviewed-by: NJames Morse <james.morse@arm.com> Reviewed-by: NSuzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NVladimir Murzin <vladimir.murzin@arm.com> [catalin.marinas@arm.com: default y for CONFIG_ARM64_CNP] Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 15 9月, 2018 1 次提交
-
-
由 Will Deacon 提交于
The cpu errata and feature enable callbacks are only called via their respective arm64_cpu_capabilities structure and therefore shouldn't exist in the global namespace. Move the PAN, RAS and cache maintenance emulation enable callbacks into the same files as their corresponding arm64_cpu_capabilities structures, making them static in the process. Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 10 9月, 2018 1 次提交
-
-
由 Will Deacon 提交于
Being consistent in our capitalisation for page-table dumps helps when grepping for things like "end". Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 18 8月, 2018 2 次提交
-
-
由 Marek Szyprowski 提交于
The CMA memory allocator doesn't support standard gfp flags for memory allocation, so there is no point having it as a parameter for dma_alloc_from_contiguous() function. Replace it by a boolean no_warn argument, which covers all the underlaying cma_alloc() function supports. This will help to avoid giving false feeling that this function supports standard gfp flags and callers can pass __GFP_ZERO to get zeroed buffer, what has already been an issue: see commit dd65a941 ("arm64: dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag"). Link: http://lkml.kernel.org/r/20180709122020eucas1p21a71b092975cb4a3b9954ffc63f699d1~-sqUFoa-h2939329393eucas1p2Y@eucas1p2.samsung.comSigned-off-by: NMarek Szyprowski <m.szyprowski@samsung.com> Acked-by: NMichał Nazarewicz <mina86@mina86.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Reviewed-by: NChristoph Hellwig <hch@lst.de> Cc: Laura Abbott <labbott@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Joonsoo Kim <js1304@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Souptick Joarder 提交于
Use new return type vm_fault_t for fault handler. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. Ref-> commit 1c8f4220 ("mm: change return type to vm_fault_t") In this patch all the caller of handle_mm_fault() are changed to return vm_fault_t type. Link: http://lkml.kernel.org/r/20180617084810.GA6730@jordon-HP-15-Notebook-PCSigned-off-by: NSouptick Joarder <jrdr.linux@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Tony Luck <tony.luck@intel.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Simek <monstr@monstr.eu> Cc: James Hogan <jhogan@kernel.org> Cc: Ley Foon Tan <lftan@altera.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: David S. Miller <davem@davemloft.net> Cc: Richard Weinberger <richard@nod.at> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Levin, Alexander (Sasha Levin)" <alexander.levin@verizon.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 8月, 2018 1 次提交
-
-
由 Greg Hackmann 提交于
ARM64's pfn_valid() shifts away the upper PAGE_SHIFT bits of the input before seeing if the PFN is valid. This leads to false positives when some of the upper bits are set, but the lower bits match a valid PFN. For example, the following userspace code looks up a bogus entry in /proc/kpageflags: int pagemap = open("/proc/self/pagemap", O_RDONLY); int pageflags = open("/proc/kpageflags", O_RDONLY); uint64_t pfn, val; lseek64(pagemap, [...], SEEK_SET); read(pagemap, &pfn, sizeof(pfn)); if (pfn & (1UL << 63)) { /* valid PFN */ pfn &= ((1UL << 55) - 1); /* clear flag bits */ pfn |= (1UL << 55); lseek64(pageflags, pfn * sizeof(uint64_t), SEEK_SET); read(pageflags, &val, sizeof(val)); } On ARM64 this causes the userspace process to crash with SIGSEGV rather than reading (1 << KPF_NOPAGE). kpageflags_read() treats the offset as valid, and stable_page_flags() will try to access an address between the user and kernel address ranges. Fixes: c1cc1552 ("arm64: MMU initialisation") Cc: stable@vger.kernel.org Signed-off-by: NGreg Hackmann <ghackmann@google.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 09 8月, 2018 1 次提交
-
-
由 Dongjiu Geng 提交于
In order to remove the additional check before calling the ghes_notify_sea(), make stub definition when !CONFIG_ACPI_APEI_SEA. After this cleanup, we can simply call the ghes_notify_sea() to let APEI driver handle the SEA notification. Signed-off-by: NDongjiu Geng <gengdongjiu@huawei.com> Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 02 8月, 2018 1 次提交
-
-
由 Linus Torvalds 提交于
Commit 2c4541e2 ("mm: use vma_init() to initialize VMAs on stack and data segments") tried to initialize various left-over ad-hoc vma's "properly", but actually made things worse for the temporary vma's used for TLB flushing. vma_init() doesn't actually initialize all of the vma, just a few fields, so doing something like - struct vm_area_struct vma = { .vm_mm = tlb->mm, }; + struct vm_area_struct vma; + + vma_init(&vma, tlb->mm); was actually very bad: instead of having a nicely initialized vma with every field but "vm_mm" zeroed, you'd have an entirely uninitialized vma with only a couple of fields initialized. And they weren't even fields that the code in question mostly cared about. The flush_tlb_range() function takes a "struct vma" rather than a "struct mm_struct", because a few architectures actually care about what kind of range it is - being able to only do an ITLB flush if it's a range that doesn't have data accesses enabled, for example. And all the normal users already have the vma for doing the range invalidation. But a few people want to call flush_tlb_range() with a range they just made up, so they also end up using a made-up vma. x86 just has a special "flush_tlb_mm_range()" function for this, but other architectures (arm and ia64) do the "use fake vma" thing instead, and thus got caught up in the vma_init() changes. At the same time, the TLB flushing code really doesn't care about most other fields in the vma, so vma_init() is just unnecessary and pointless. This fixes things by having an explicit "this is just an initializer for the TLB flush" initializer macro, which is used by the arm/arm64/ia64 people who mis-use this interface with just a dummy vma. Fixes: 2c4541e2 ("mm: use vma_init() to initialize VMAs on stack and data segments") Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: John Stultz <john.stultz@linaro.org> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 27 7月, 2018 2 次提交
-
-
由 Ben Hutchings 提交于
The xen-privcmd driver, which can be modular, calls set_pte_at() which in turn may call __sync_icache_dcache(). The call to __sync_icache_dcache() may be optimised out because it is conditional on !pte_special(), and xen-privcmd calls pte_mkspecial(). But it seems unwise to rely on this optimisation. Fixes: 3ad08765 ("xen/privcmd: add IOCTL_PRIVCMD_MMAP_RESOURCE") Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Kirill A. Shutemov 提交于
Make sure to initialize all VMAs properly, not only those which come from vm_area_cachep. Link: http://lkml.kernel.org/r/20180724121139.62570-3-kirill.shutemov@linux.intel.comSigned-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 7月, 2018 1 次提交
-
-
由 Johannes Weiner 提交于
Arnd reports the following arm64 randconfig build error with the PSI patches that add another page flag: /git/arm-soc/arch/arm64/mm/init.c: In function 'mem_init': /git/arm-soc/include/linux/compiler.h:357:38: error: call to '__compiletime_assert_618' declared with attribute error: BUILD_BUG_ON failed: sizeof(struct page) > (1 << STRUCT_PAGE_MAX_SHIFT) The additional page flag causes other information stored in page->flags to get bumped into their own struct page member: #if SECTIONS_WIDTH+ZONES_WIDTH+NODES_SHIFT+LAST_CPUPID_SHIFT <= BITS_PER_LONG - NR_PAGEFLAGS #define LAST_CPUPID_WIDTH LAST_CPUPID_SHIFT #else #define LAST_CPUPID_WIDTH 0 #endif #if defined(CONFIG_NUMA_BALANCING) && LAST_CPUPID_WIDTH == 0 #define LAST_CPUPID_NOT_IN_PAGE_FLAGS #endif which in turn causes the struct page size to exceed the size set in STRUCT_PAGE_MAX_SHIFT. This value is an an estimate used to size the VMEMMAP page array according to address space and struct page size. However, the check is performed - and triggers here - on a !VMEMMAP config, which consumes an additional 22 page bits for the sparse section id. When VMEMMAP is enabled, those bits are returned, cpupid doesn't need its own member, and the page passes the VMEMMAP check. Restrict that check to the situation it was meant to check: that we are sizing the VMEMMAP page array correctly. Says Arnd: Further experiments show that the build error already existed before, but was only triggered with larger values of CONFIG_NR_CPU and/or CONFIG_NODES_SHIFT that might be used in actual configurations but not in randconfig builds. With longer CPU and node masks, I could recreate the problem with kernels as old as linux-4.7 when arm64 NUMA support got added. Reported-by: NArnd Bergmann <arnd@arndb.de> Tested-by: NArnd Bergmann <arnd@arndb.de> Cc: stable@vger.kernel.org Fixes: 1a2db300 ("arm64, numa: Add NUMA support for arm64 platforms.") Fixes: 3e1907d5 ("arm64: mm: move vmemmap region right below the linear region") Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 12 7月, 2018 1 次提交
-
-
由 Mark Rutland 提交于
Now that we have sysreg_clear_set(), we can consistently use this instead of config_sctlr_el1(). Signed-off-by: NMark Rutland <mark.rutland@arm.com> Reviewed-by: NDave Martin <dave.martin@arm.com> Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 06 7月, 2018 4 次提交
-
-
由 Will Deacon 提交于
lkdtm calls flush_icache_range(), which results in an out-of-line call to __flush_icache_range(), which is not exported to modules. Export the symbol to modules to fix this build breakage. Fixes: 3b8c9f1c ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings") Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Sudeep Holla 提交于
Currently numa_clear_node removes both cpu information from the NUMA node cpumap as well as the NUMA node id from the cpu. Similarly numa_store_cpu_info updates both percpu nodeid and NUMA cpumap. However we need to retain the numa node id for the cpu and only remove the cpu information from the numa node cpumap during CPU hotplug out. The same can be extended for hotplugging in the CPU. This patch separates out numa_{add,remove}_cpu from numa_clear_node and numa_store_cpu_info. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Reviewed-by: NGanapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Tested-by: NGanapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Tested-by: NHanjun Guo <hanjun.guo@linaro.org> Signed-off-by: NSudeep Holla <sudeep.holla@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Chintan Pandya 提交于
arm64 requires break-before-make. Originally, before setting up new pmd/pud entry for huge mapping, in few cases, the modifying pmd/pud entry was still valid and pointing to next level page table as we only clear off leaf PTE in unmap leg. a) This was resulting into stale entry in TLBs (as few TLBs also cache intermediate mapping for performance reasons) b) Also, modifying pmd/pud was the only reference to next level page table and it was getting lost without freeing it. So, page leaks were happening. Implement pud_free_pmd_page() and pmd_free_pte_page() to enforce BBM and also free the leaking page tables. Implementation requires, 1) Clearing off the current pud/pmd entry 2) Invalidation of TLB 3) Freeing of the un-used next level page tables Reviewed-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NChintan Pandya <cpandya@codeaurora.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Will Deacon 提交于
When invalidating the instruction cache for a kernel mapping via flush_icache_range(), it is also necessary to flush the pipeline for other CPUs so that instructions fetched into the pipeline before the I-cache invalidation are discarded. For example, if module 'foo' is unloaded and then module 'bar' is loaded into the same area of memory, a CPU could end up executing instructions from 'foo' when branching into 'bar' if these instructions were fetched into the pipeline before 'foo' was unloaded. Whilst this is highly unlikely to occur in practice, particularly as any exception acts as a context-synchronizing operation, following the letter of the architecture requires us to execute an ISB on each CPU in order for the new instruction stream to be visible. Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 05 7月, 2018 1 次提交
-
-
由 Chintan Pandya 提交于
The following kernel panic was observed on ARM64 platform due to a stale TLB entry. 1. ioremap with 4K size, a valid pte page table is set. 2. iounmap it, its pte entry is set to 0. 3. ioremap the same address with 2M size, update its pmd entry with a new value. 4. CPU may hit an exception because the old pmd entry is still in TLB, which leads to a kernel panic. Commit b6bdb751 ("mm/vmalloc: add interfaces to free unmapped page table") has addressed this panic by falling to pte mappings in the above case on ARM64. To support pmd mappings in all cases, TLB purge needs to be performed in this case on ARM64. Add a new arg, 'addr', to pud_free_pmd_page() and pmd_free_pte_page() so that TLB purge can be added later in seprate patches. [toshi.kani@hpe.com: merge changes, rewrite patch description] Fixes: 28ee90fe ("x86/mm: implement free pmd/pte page interfaces") Signed-off-by: NChintan Pandya <cpandya@codeaurora.org> Signed-off-by: NToshi Kani <toshi.kani@hpe.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: mhocko@suse.com Cc: akpm@linux-foundation.org Cc: hpa@zytor.com Cc: linux-mm@kvack.org Cc: linux-arm-kernel@lists.infradead.org Cc: Will Deacon <will.deacon@arm.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: stable@vger.kernel.org Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20180627141348.21777-3-toshi.kani@hpe.com
-
- 02 7月, 2018 1 次提交
-
-
由 Peng Donglin 提交于
Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code. Signed-off-by: NPeng Donglin <dolinux.peng@gmail.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 23 6月, 2018 1 次提交
-
-
由 Will Deacon 提交于
When rewriting swapper using nG mappings, we must performance cache maintenance around each page table access in order to avoid coherency problems with the host's cacheable alias under KVM. To ensure correct ordering of the maintenance with respect to Device memory accesses made with the Stage-1 MMU disabled, DMBs need to be added between the maintenance and the corresponding memory access. This patch adds a missing DMB between writing a new page table entry and performing a clean+invalidate on the same line. Fixes: f992b4df ("arm64: kpti: Add ->enable callback to remap swapper using nG mappings") Cc: <stable@vger.kernel.org> # 4.16.x- Acked-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 19 6月, 2018 1 次提交
-
-
由 Marek Szyprowski 提交于
dma_alloc_*() buffers might be exposed to userspace via mmap() call, so they should be cleared on allocation. In case of IOMMU-based dma-mapping implementation such buffer clearing was missing in the code path for DMA_ATTR_FORCE_CONTIGUOUS flag handling, because dma_alloc_from_contiguous() doesn't honor __GFP_ZERO flag. This patch fixes this issue. For more information on clearing buffers allocated by dma_alloc_* functions, see commit 6829e274 ("arm64: dma-mapping: always clear allocated buffers"). Fixes: 44176bb3 ("arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU") Signed-off-by: NMarek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 15 6月, 2018 1 次提交
-
-
由 Stefan Agner 提交于
With PHYS_ADDR_MAX there is now a type safe variant for all bits set. Make use of it. Patch created using a semantic patch as follows: // <smpl> @@ typedef phys_addr_t; @@ -(phys_addr_t)ULLONG_MAX +PHYS_ADDR_MAX // </smpl> Link: http://lkml.kernel.org/r/20180419214204.19322-1-stefan@agner.chSigned-off-by: NStefan Agner <stefan@agner.ch> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 6月, 2018 1 次提交
-
-
由 Kees Cook 提交于
The kzalloc() function has a 2-factor argument form, kcalloc(). This patch replaces cases of: kzalloc(a * b, gfp) with: kcalloc(a * b, gfp) as well as handling cases of: kzalloc(a * b * c, gfp) with: kzalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kzalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kzalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kzalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | kzalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kzalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(__u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(char) * COUNT + COUNT , ...) | kzalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kzalloc + kcalloc ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kzalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kzalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kzalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kzalloc(C1 * C2 * C3, ...) | kzalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) | kzalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kzalloc(sizeof(THING) * C2, ...) | kzalloc(sizeof(TYPE) * C2, ...) | kzalloc(C1 * C2 * C3, ...) | kzalloc(C1 * C2, ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - (E1) * E2 + E1, E2 , ...) | - kzalloc + kcalloc ( - (E1) * (E2) + E1, E2 , ...) | - kzalloc + kcalloc ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: NKees Cook <keescook@chromium.org>
-
- 24 5月, 2018 1 次提交
-
-
由 Laura Abbott 提交于
Commit 15122ee2 ("arm64: Enforce BBM for huge IO/VMAP mappings") disallowed block mappings for ioremap since that code does not honor break-before-make. The same APIs are also used for permission updating though and the extra checks prevent the permission updates from happening, even though this should be permitted. This results in read-only permissions not being fully applied. Visibly, this can occasionaly be seen as a failure on the built in rodata test when the test data ends up in a section or as an odd RW gap on the page table dump. Fix this by using pgattr_change_is_safe instead of p*d_present for determining if the change is permitted. Reviewed-by: NKees Cook <keescook@chromium.org> Tested-by: NPeter Robinson <pbrobinson@gmail.com> Reported-by: NPeter Robinson <pbrobinson@gmail.com> Fixes: 15122ee2 ("arm64: Enforce BBM for huge IO/VMAP mappings") Signed-off-by: NLaura Abbott <labbott@redhat.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 23 5月, 2018 3 次提交
-
-
由 Mark Rutland 提交于
In do_page_fault(), we handle some kernel faults early, and simply die() with a message. For faults handled later, we dump the faulting address, decode the ESR, walk the page tables, and perform a number of steps to ensure that this data is reported. Let's unify the handling of fatal kernel faults with a new die_kernel_fault() helper, handling all of these details. This is largely the same as the existing logic in __do_kernel_fault(), except that addresses are consistently padded to 16 hex characters, as would be expected for a 64-bit address. The messages currently logged in do_page_fault are adjusted to fit into the die_kernel_fault() message template. Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Mark Rutland 提交于
The naming of is_permission_fault() makes it sound like it should return true for permission faults from EL0, but by design, it only does so for faults from EL1. Let's make this clear by dropping el1 in the name, as we do for is_el1_instruction_abort(). Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Peter Maydell 提交于
If userspace faults on a kernel address, handing them the raw ESR value on the sigframe as part of the delivered signal can leak data useful to attackers who are using information about the underlying hardware fault type (e.g. translation vs permission) as a mechanism to defeat KASLR. However there are also legitimate uses for the information provided in the ESR -- notably the GCC and LLVM sanitizers use this to report whether wild pointer accesses by the application are reads or writes (since a wild write is a more serious bug than a wild read), so we don't want to drop the ESR information entirely. For faulting addresses in the kernel, sanitize the ESR. We choose to present userspace with the illusion that there is nothing mapped in the kernel's part of the address space at all, by reporting all faults as level 0 translation faults taken to EL1. These fields are safe to pass through to userspace as they depend only on the instruction that userspace used to provoke the fault: EC IL (always) ISV CM WNR (for all data aborts) All the other fields in ESR except DFSC are architecturally RES0 for an L0 translation fault taken to EL1, so can be zeroed out without confusing userspace. The illusion is not entirely perfect, as there is a tiny wrinkle where we will report an alignment fault that was not due to the memory type (for instance a LDREX to an unaligned address) as a translation fault, whereas if you do this on real unmapped memory the alignment fault takes precedence. This is not likely to trip anybody up in practice, as the only users we know of for the ESR information who care about the behaviour for kernel addresses only really want to know about the WnR bit. Signed-off-by: NPeter Maydell <peter.maydell@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 15 5月, 2018 1 次提交
-
-
由 Catalin Marinas 提交于
This patch increases the ARCH_DMA_MINALIGN to 128 so that it covers the currently known Cache Writeback Granule (CTR_EL0.CWG) on arm64 and moves the fallback in cache_line_size() from L1_CACHE_BYTES to this constant. In addition, it warns (and taints) if the CWG is larger than ARCH_DMA_MINALIGN as this is not safe with non-coherent DMA. Cc: Will Deacon <will.deacon@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 08 5月, 2018 1 次提交
-
-
由 Christoph Hellwig 提交于
Most mainstream architectures are using 65536 entries, so lets stick to that. If someone is really desperate to override it that can still be done through <asm/dma-mapping.h>, but I'd rather see a really good rationale for that. dma_debug_init is now called as a core_initcall, which for many architectures means much earlier, and provides dma-debug functionality earlier in the boot process. This should be safe as it only relies on the memory allocator already being available. Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NMarek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: NRobin Murphy <robin.murphy@arm.com>
-
- 01 5月, 2018 1 次提交
-
-
由 CHANDAN VN 提交于
INITRD reserved area entry is not removed from memblock even though initrd reserved area is freed. After freeing the memory it is released from memblock. The same can be checked from /sys/kernel/debug/memblock/reserved. The patch makes sure that the initrd entry is removed from memblock when keepinitrd is not enabled. The patch only affects accounting and debugging. This does not fix any memory leak. Acked-by: NLaura Abbott <labbott@redhat.com> Signed-off-by: NCHANDAN VN <chandan.vn@samsung.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 25 4月, 2018 1 次提交
-
-
由 Eric W. Biederman 提交于
Call clear_siginfo to ensure every stack allocated siginfo is properly initialized before being passed to the signal sending functions. Note: It is not safe to depend on C initializers to initialize struct siginfo on the stack because C is allowed to skip holes when initializing a structure. The initialization of struct siginfo in tracehook_report_syscall_exit was moved from the helper user_single_step_siginfo into tracehook_report_syscall_exit itself, to make it clear that the local variable siginfo gets fully initialized. In a few cases the scope of struct siginfo has been reduced to make it clear that siginfo siginfo is not used on other paths in the function in which it is declared. Instances of using memset to initialize siginfo have been replaced with calls clear_siginfo for clarity. Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
- 24 4月, 2018 1 次提交
-
-
由 Shaokun Zhang 提交于
The addr parameter isn't used for anything. Let's simplify and get rid of it, like arm. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: NShaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 17 4月, 2018 1 次提交
-
-
由 Mark Rutland 提交于
In arm64's kasan_init(), we use pfn_to_nid() to find the NUMA node a span of memory is in, hoping to allocate shadow from the same NUMA node. However, at this point, the page array has not been initialized, and thus this is bogus. Since commit: f165b378 ("mm: uninitialized struct page poisoning sanity") ... accessing fields of the page array results in a boot time Oops(), highlighting this problem: [ 0.000000] Unable to handle kernel paging request at virtual address dfff200000000000 [ 0.000000] Mem abort info: [ 0.000000] ESR = 0x96000004 [ 0.000000] Exception class = DABT (current EL), IL = 32 bits [ 0.000000] SET = 0, FnV = 0 [ 0.000000] EA = 0, S1PTW = 0 [ 0.000000] Data abort info: [ 0.000000] ISV = 0, ISS = 0x00000004 [ 0.000000] CM = 0, WnR = 0 [ 0.000000] [dfff200000000000] address between user and kernel address ranges [ 0.000000] Internal error: Oops: 96000004 [#1] PREEMPT SMP [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.16.0-07317-gf165b378 #42 [ 0.000000] Hardware name: ARM Juno development board (r1) (DT) [ 0.000000] pstate: 80000085 (Nzcv daIf -PAN -UAO) [ 0.000000] pc : __asan_load8+0x8c/0xa8 [ 0.000000] lr : __dump_page+0x3c/0x3b8 [ 0.000000] sp : ffff2000099b7ca0 [ 0.000000] x29: ffff2000099b7ca0 x28: ffff20000a1762c0 [ 0.000000] x27: ffff7e0000000000 x26: ffff2000099dd000 [ 0.000000] x25: ffff200009a3f960 x24: ffff200008f9c38c [ 0.000000] x23: ffff20000a9d3000 x22: ffff200009735430 [ 0.000000] x21: fffffffffffffffe x20: ffff7e0001e50420 [ 0.000000] x19: ffff7e0001e50400 x18: 0000000000001840 [ 0.000000] x17: ffffffffffff8270 x16: 0000000000001840 [ 0.000000] x15: 0000000000001920 x14: 0000000000000004 [ 0.000000] x13: 0000000000000000 x12: 0000000000000800 [ 0.000000] x11: 1ffff0012d0f89ff x10: ffff10012d0f89ff [ 0.000000] x9 : 0000000000000000 x8 : ffff8009687c5000 [ 0.000000] x7 : 0000000000000000 x6 : ffff10000f282000 [ 0.000000] x5 : 0000000000000040 x4 : fffffffffffffffe [ 0.000000] x3 : 0000000000000000 x2 : dfff200000000000 [ 0.000000] x1 : 0000000000000005 x0 : 0000000000000000 [ 0.000000] Process swapper (pid: 0, stack limit = 0x (ptrval)) [ 0.000000] Call trace: [ 0.000000] __asan_load8+0x8c/0xa8 [ 0.000000] __dump_page+0x3c/0x3b8 [ 0.000000] dump_page+0xc/0x18 [ 0.000000] kasan_init+0x2e8/0x5a8 [ 0.000000] setup_arch+0x294/0x71c [ 0.000000] start_kernel+0xdc/0x500 [ 0.000000] Code: aa0403e0 9400063c 17ffffee d343fc00 (38e26800) [ 0.000000] ---[ end trace 67064f0e9c0cc338 ]--- [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task! [ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]--- Let's fix this by using early_pfn_to_nid(), as other architectures do in their kasan init code. Note that early_pfn_to_nid acquires the nid from the memblock array, which we iterate over in kasan_init(), so this should be fine. Signed-off-by: NMark Rutland <mark.rutland@arm.com> Fixes: 39d114dd ("arm64: add KASAN support") Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 12 4月, 2018 1 次提交
-
-
由 Kees Cook 提交于
Patch series "exec: Pin stack limit during exec". Attempts to solve problems with the stack limit changing during exec continue to be frustrated[1][2]. In addition to the specific issues around the Stack Clash family of flaws, Andy Lutomirski pointed out[3] other places during exec where the stack limit is used and is assumed to be unchanging. Given the many places it gets used and the fact that it can be manipulated/raced via setrlimit() and prlimit(), I think the only way to handle this is to move away from the "current" view of the stack limit and instead attach it to the bprm, and plumb this down into the functions that need to know the stack limits. This series implements the approach. [1] 04e35f44 ("exec: avoid RLIMIT_STACK races with prlimit()") [2] 779f4e1c ("Revert "exec: avoid RLIMIT_STACK races with prlimit()"") [3] to security@kernel.org, "Subject: existing rlimit races?" This patch (of 3): Since it is possible that the stack rlimit can change externally during exec (either via another thread calling setrlimit() or another process calling prlimit()), provide a way to pass the rlimit down into the per-architecture mm layout functions so that the rlimit can stay in the bprm structure instead of sitting in the signal structure until exec is finalized. Link: http://lkml.kernel.org/r/1518638796-20819-2-git-send-email-keescook@chromium.orgSigned-off-by: NKees Cook <keescook@chromium.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Willy Tarreau <w@1wt.eu> Cc: Hugh Dickins <hughd@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Rik van Riel <riel@redhat.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Greg KH <greg@kroah.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Cc: Brad Spengler <spender@grsecurity.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 27 3月, 2018 2 次提交
-
-
由 Will Deacon 提交于
This reverts commit 1f85b42a. The internal dma-direct.h API has changed in -next, which collides with us trying to use it to manage non-coherent DMA devices on systems with unreasonably large cache writeback granules. This isn't at all trivial to resolve, so revert our changes for now and we can revisit this after the merge window. Effectively, this just restores our behaviour back to that of 4.16. Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Suzuki K Poulose 提交于
We enable hardware DBM bit in a capable CPU, very early in the boot via __cpu_setup. This doesn't give us a flexibility of optionally disable the feature, as the clearing the bit is a bit costly as the TLB can cache the settings. Instead, we delay enabling the feature until the CPU is brought up into the kernel. We use the feature capability mechanism to handle it. The hardware DBM is a non-conflicting feature. i.e, the kernel can safely run with a mix of CPUs with some using the feature and the others don't. So, it is safe for a late CPU to have this capability and enable it, even if the active CPUs don't. To get this handled properly by the infrastructure, we unconditionally set the capability and only enable it on CPUs which really have the feature. Also, we print the feature detection from the "matches" call back to make sure we don't mislead the user when none of the CPUs could use the feature. Cc: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: NDave Martin <dave.martin@arm.com> Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-