- 27 Dec 2016, 1 commit

By Al Viro
Split asm-only parts of arm64 uaccess.h into a new header and use that from *.S. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

- 25 Dec 2016, 1 commit

By Linus Torvalds
This was entirely automated, using the script by Al:

    PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
    sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

- 22 Nov 2016, 1 commit

By Catalin Marinas
This patch moves the directly coded alternatives for turning PAN on/off into separate uaccess_{enable,disable} macros or functions. The asm macros take a few arguments which will be used in subsequent patches. Note that any (unlikely) access that the compiler might generate between uaccess_enable() and uaccess_disable(), other than those explicitly specified by the user access code, will not be protected by PAN. Cc: Will Deacon <will.deacon@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Kees Cook <keescook@chromium.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

- 16 Sep 2016, 1 commit

By Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

- 12 Sep 2016, 1 commit

By Mark Rutland
Make use of the new alternative_if and alternative_else_nop_endif and get rid of our homebrew NOP sleds, making the code simpler to read. Note that for cpu_do_switch_mm the ret has been moved out of the alternative sequence, and in the default case there will be three additional NOPs executed. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>

- 27 Jul 2016, 1 commit

By Kees Cook
Enables CONFIG_HARDENED_USERCOPY checks on arm64. As done by KASAN in -next, renames the low-level functions to __arch_copy_*_user() so a static inline can do additional work before the copy. Signed-off-by: Kees Cook <keescook@chromium.org>
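For illustration, the wrapper shape this enables might look like the following sketch (hedged: it assumes the renamed __arch_copy_to_user() and the generic check_object_size() helper, and omits the other checks the real arm64 uaccess.h performs):

    /*
     * Sketch only: with the assembly routine renamed to __arch_copy_to_user(),
     * a static inline can run the hardened-usercopy object-size check on the
     * kernel buffer before handing off to the low-level copy.
     */
    static inline unsigned long __must_check
    copy_to_user(void __user *to, const void *from, unsigned long n)
    {
            check_object_size(from, n, true);   /* CONFIG_HARDENED_USERCOPY check */
            return __arch_copy_to_user(to, from, n);
    }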

- 21 Jun 2016, 1 commit

By Yang Shi
The upstream commit 1771c6e1 ("x86/kasan: instrument user memory access API") added KASAN instrumentation to the x86 user memory access API, so add the same instrumentation to arm64. Define __copy_to/from_user in C in order to add the kasan_check_read/write calls, and rename the assembly implementation to __arch_copy_to/from_user. Tested with the test_kasan module. Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Yang Shi <yang.shi@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
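The C-level shape this describes is roughly the following (a hedged sketch assuming kasan_check_read/write and the renamed arch routines; access_ok handling and other details are omitted):

    /*
     * Sketch only: defining the wrappers in C lets KASAN check the kernel-side
     * buffer before the (uninstrumented) assembly routine runs.
     */
    static inline unsigned long __must_check
    __copy_from_user(void *to, const void __user *from, unsigned long n)
    {
            kasan_check_write(to, n);           /* destination buffer is written */
            return __arch_copy_from_user(to, from, n);
    }

    static inline unsigned long __must_check
    __copy_to_user(void __user *to, const void *from, unsigned long n)
    {
            kasan_check_read(from, n);          /* source buffer is read */
            return __arch_copy_to_user(to, from, n);
    }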

- 05 Mar 2016, 1 commit

By Adam Buchbinder
Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

- 27 Feb 2016, 1 commit

By Ard Biesheuvel
The LSE atomics implementation uses runtime patching to patch in calls to out-of-line non-LSE atomics implementations on cores that lack hardware support for LSE. To avoid paying the overhead cost of a function call even if no call ends up being made, the bl instruction is kept invisible to the compiler, and the out-of-line implementations preserve all registers, not just the ones that they are required to preserve as per the AAPCS64.

However, commit fd045f6c ("arm64: add support for module PLTs") added support for routing branch instructions via veneers if the branch target offset exceeds the range of the ordinary relative branch instructions. Since this deals with jump and call instructions that are exposed to ELF relocations, the PLT code uses x16 to hold the address of the branch target when it performs an indirect branch-to-register, something which is explicitly allowed by the AAPCS64 (and ordinary compiler-generated code does not expect register x16 or x17 to retain their values across a bl instruction).

Since the LSE runtime-patched bl instructions don't adhere to the AAPCS64, they don't deal with this clobbering of registers x16 and x17. So add them to the clobber list of the asm() statements that perform the call instructions, and drop x16 and x17 from the list of registers that are callee-saved in the out-of-line non-LSE implementations. In addition, since we have given these functions two scratch registers, they no longer need to stack/unstack temp registers. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [will: factored clobber list into #define, updated Makefile comment] Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

- 19 Feb 2016, 2 commits

By James Morse
If a CPU supports both Privileged Access Never (PAN) and User Access Override (UAO), we don't need to disable/re-enable PAN around all copy_to_user()-like calls. UAO alternatives cause these calls to use the 'unprivileged' load/store instructions, which are overridden to be the privileged kind when fs==KERNEL_DS. This patch changes the copy_to_user() calls to have their PAN toggling depend on a new composite 'feature' ARM64_ALT_PAN_NOT_UAO. If both features are detected, PAN will be enabled, but the copy_to_user() alternatives will not be applied. This means PAN will be enabled all the time for these functions. If only PAN is detected, the toggling will be enabled as normal. This will save the time taken to disable/re-enable PAN, and allow us to catch copy_to_user() accesses that occur with fs==KERNEL_DS. Futex and swp-emulation code continue to hang their PAN toggling code on ARM64_HAS_PAN. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

By James Morse
'User Access Override' is a new ARMv8.2 feature which allows the unprivileged load and store instructions to be overridden to behave in the normal way. This patch converts {get,put}_user() and friends to use ldtr*/sttr* instructions, so that they can only access EL0 memory, then enables UAO when fs==KERNEL_DS so that these functions can access kernel memory. This allows user space's read/write permissions to be checked against the page tables, instead of testing addr<USER_DS, then using the kernel's read/write permissions. Signed-off-by: James Morse <james.morse@arm.com> [catalin.marinas@arm.com: move uao_thread_switch() above dsb()] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

- 16 Feb 2016, 3 commits

By Andrew Pinski
On ThunderX T88 pass 1 and pass 2, there is no hardware prefetching, so we need to patch in explicit software prefetching instructions. Prefetching improves this code by 60% over the original code and 2x over the code without prefetching for the affected hardware, using the benchmark code at https://github.com/apinski-cavium/copy_page_benchmark Signed-off-by: Andrew Pinski <apinski@cavium.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Tested-by: Andrew Pinski <apinski@cavium.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

By Will Deacon
We want to avoid lots of different copy_page implementations, settling for something that is "good enough" everywhere and hopefully easy to understand and maintain whilst we're at it. This patch reworks our copy_page implementation based on discussions with Cavium on the list and benchmarking on Cortex-A processors so that:

  - The loop is unrolled to copy 128 bytes per iteration
  - The reads are offset so that we read from the next 128-byte block in the same iteration that we store the previous block
  - Explicit prefetch instructions are removed for now, since they hurt performance on CPUs with hardware prefetching
  - The loop exit condition is calculated at the start of the loop

Signed-off-by: Will Deacon <will.deacon@arm.com> Tested-by: Andrew Pinski <apinski@cavium.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

By Thierry Reding
Changes introduced in the upstream version of libfdt pulled in by commit 91feabc2 ("scripts/dtc: Update to upstream commit b06e55c88b9b") use the strnlen() function, which isn't currently available to the EFI namespace. Add it to the EFI namespace to avoid a linker error. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Rob Herring <robh@kernel.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Will Deacon <will.deacon@arm.com>

- 13 Oct 2015, 1 commit

By Andrey Ryabinin
This patch adds arch-specific code for the kernel address sanitizer (see Documentation/kasan.txt). 1/8 of the kernel address space is reserved for shadow memory. There was no big enough hole for this, so virtual addresses for the shadow were stolen from the vmalloc area. At early boot stage the whole shadow region is populated with just one physical page (kasan_zero_page). Later, this page is reused as a read-only zero shadow for some memory that KASan currently doesn't track (vmalloc). After mapping the physical memory, pages for shadow memory are allocated and mapped. Functions like memset/memmove/memcpy do a lot of memory accesses. If a bad pointer is passed to one of these functions, it is important to catch this. The compiler's instrumentation cannot do this since these functions are written in assembly, so KASan replaces the memory functions with manually instrumented variants. The original functions are declared as weak symbols so that strong definitions in mm/kasan/kasan.c can replace them. The original functions have aliases with a '__' prefix in the name, so we can call the non-instrumented variant if needed. Some files are built without KASan instrumentation (e.g. mm/slub.c); for such files the original mem* functions are replaced (via #define) with the prefixed variants to disable memory access checks. Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com> Tested-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
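The weak-symbol/prefix scheme described here can be illustrated with a small stand-alone sketch (hypothetical code, not the kernel's actual assembly implementation):

    #include <stddef.h>

    /*
     * Sketch only: the real routine keeps a '__'-prefixed name, and the
     * unprefixed name is just a weak alias. KASan's instrumented strong
     * definition of memset() can then override the alias at link time, while
     * files built without instrumentation still reach the raw copy as __memset().
     */
    void *__memset(void *s, int c, size_t n)
    {
            unsigned char *p = s;

            while (n--)
                    *p++ = (unsigned char)c;
            return s;
    }

    void *memset(void *s, int c, size_t n)
            __attribute__((weak, alias("__memset")));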

- 12 Oct 2015, 1 commit

By Ard Biesheuvel
For more control over which functions are called with the MMU off or with the UEFI 1:1 mapping active, annotate some assembler routines as position independent. This is done by introducing ENDPIPROC(), which replaces the ENDPROC() declaration of those routines. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

- 07 Oct 2015, 2 commits

By Feng Kan
This patch optimizes copy_to/from/in_user for the arm64 architecture. The copy template is used as the template file for all the copy*.S files; a minor change was made to it to accommodate the copy to/from/in user files. Signed-off-by: Feng Kan <fkan@apm.com> Signed-off-by: Balamurugan Shanmugam <bshanmugam@apm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

By Feng Kan
This converts memcpy.S to use the copy template file. The copy template file was originally based on memcpy.S. Signed-off-by: Feng Kan <fkan@apm.com> Signed-off-by: Balamurugan Shanmugam <bshanmugam@apm.com> [catalin.marinas@arm.com: removed tmp3(w) .req statements as they are not used] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

- 27 Jul 2015, 5 commits

By Will Deacon
The cost of changing a cacheline from shared to exclusive state can be significant, especially when this is triggered by an exclusive store, since it may result in having to retry the transaction. This patch makes use of prfm to prefetch cachelines for write prior to ldxr/stxr loops when using the ll/sc atomic routines. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
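The shape of such a loop is roughly the following (a hedged sketch with illustrative names, not the kernel's atomic_ll_sc code; AArch64-only inline assembly):

    #include <stdint.h>

    /*
     * Sketch only: an LL/SC add with a "prefetch for store" issued ahead of
     * the exclusive pair, so the cache line is already in a writable state
     * by the time the stxr executes.
     */
    static inline void llsc_add(uint64_t val, volatile uint64_t *ptr)
    {
            uint64_t tmp;
            uint32_t status;

            asm volatile(
            "       prfm    pstl1strm, %[v]\n"          /* prefetch line for store */
            "1:     ldxr    %[tmp], %[v]\n"             /* exclusive load */
            "       add     %[tmp], %[tmp], %[val]\n"
            "       stxr    %w[status], %[tmp], %[v]\n" /* exclusive store */
            "       cbnz    %w[status], 1b\n"           /* retry if exclusivity was lost */
            : [tmp] "=&r" (tmp), [status] "=&r" (status), [v] "+Q" (*ptr)
            : [val] "r" (val)
            : "memory");
    }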

By Will Deacon
On CPUs which support the LSE atomic instructions introduced in ARMv8.1, it makes sense to use them in preference to ll/sc sequences. This patch introduces runtime patching of our bitops functions so that LSE atomic instructions are used instead. Reviewed-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>

By Will Deacon
In order to patch in the new atomic instructions at runtime, we need to generate wrappers around the out-of-line exclusive load/store atomics. This patch adds a new Kconfig option, CONFIG_ARM64_LSE_ATOMICS, which causes our atomic functions to branch to the out-of-line ll/sc implementations. To avoid the register spill overhead of the PCS, the out-of-line functions are compiled with specific compiler flags to force out-of-line save/restore of any registers that are usually caller-saved. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>

By James Morse
'Privileged Access Never' is a new ARMv8.1 feature which prevents privileged code from accessing any virtual address where read or write access is also permitted at EL0. This patch enables the PAN feature on all CPUs, and modifies the {get,put}_user helpers temporarily to permit access. This will catch kernel bugs where user memory is accessed directly. 'Unprivileged loads and stores' using ldtrb et al are unaffected by PAN. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com> [will: use ALTERNATIVE in asm and tidy up pan_enable check] Signed-off-by: Will Deacon <will.deacon@arm.com>

By Will Deacon
The AArch64 instruction set contains load/store pair memory accessors, so use these in our copy_*_user routines to transfer 16 bytes per iteration. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
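A minimal sketch of the idea (illustrative only; it has none of the alignment, fault, or tail handling of the real copy_*_user routines, and uses AArch64-only inline assembly):

    /*
     * Sketch only: move 16 bytes per iteration with a load-pair/store-pair,
     * using post-index addressing to advance both pointers.
     */
    static void copy_16byte_chunks(void *dst, const void *src, unsigned long n)
    {
            unsigned long t1, t2;

            while (n >= 16) {
                    asm volatile(
                    "       ldp     %[t1], %[t2], [%[s]], #16\n"
                    "       stp     %[t1], %[t2], [%[d]], #16\n"
                    : [t1] "=&r" (t1), [t2] "=&r" (t2),
                      [s] "+r" (src), [d] "+r" (dst)
                    :
                    : "memory");
                    n -= 16;
            }
    }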

- 13 Nov 2014, 1 commit

By Kyle McMartin
ARM64 currently doesn't fix up faults on the single-byte (strb) case of __clear_user... which means that we can cause a nasty kernel panic as an ordinary user with any multiple-of-PAGE_SIZE+1 read from /dev/zero, i.e.:

    dd if=/dev/zero of=foo ibs=1 count=1 (or ibs=65537, etc.)

This is a pretty obscure bug in the general case since we'll only __do_kernel_fault (since there's no extable entry for pc) if the mmap_sem is contended. However, with CONFIG_DEBUG_VM enabled, we'll always fault:

    if (!down_read_trylock(&mm->mmap_sem)) {
            if (!user_mode(regs) && !search_exception_tables(regs->pc))
                    goto no_context;
    retry:
            down_read(&mm->mmap_sem);
    } else {
            /*
             * The above down_read_trylock() might have succeeded in which
             * case, we'll have missed the might_sleep() from down_read().
             */
            might_sleep();
            if (!user_mode(regs) && !search_exception_tables(regs->pc))
                    goto no_context;
    }

Fix that by adding an extable entry for the strb instruction, since it touches user memory, similar to the other stores in __clear_user. Signed-off-by: Kyle McMartin <kyle@redhat.com> Reported-by: Miloš Prchlík <mprchlik@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

- 23 May 2014, 6 commits

By zhichang.yuan
This patch, based on Linaro's Cortex Strings library, adds assembly-optimized strlen() and strnlen() functions. Signed-off-by: Zhichang Yuan <zhichang.yuan@linaro.org> Signed-off-by: Deepak Saxena <dsaxena@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

By zhichang.yuan
This patch, based on Linaro's Cortex Strings library, adds assembly-optimized strcmp() and strncmp() functions. Signed-off-by: Zhichang Yuan <zhichang.yuan@linaro.org> Signed-off-by: Deepak Saxena <dsaxena@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

By zhichang.yuan
This patch, based on Linaro's Cortex Strings library, adds an assembly-optimized memcmp() function. Signed-off-by: Zhichang Yuan <zhichang.yuan@linaro.org> Signed-off-by: Deepak Saxena <dsaxena@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

By zhichang.yuan
This patch, based on Linaro's Cortex Strings library, improves the performance of the assembly-optimized memset() function. Signed-off-by: Zhichang Yuan <zhichang.yuan@linaro.org> Signed-off-by: Deepak Saxena <dsaxena@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

By zhichang.yuan
This patch, based on Linaro's Cortex Strings library, improves the performance of the assembly-optimized memmove() function. Signed-off-by: Zhichang Yuan <zhichang.yuan@linaro.org> Signed-off-by: Deepak Saxena <dsaxena@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

By zhichang.yuan
This patch, based on Linaro's Cortex Strings library, improves the performance of the assembly-optimized memcpy() function. Signed-off-by: Zhichang Yuan <zhichang.yuan@linaro.org> Signed-off-by: Deepak Saxena <dsaxena@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

- 08 Feb 2014, 1 commit

By Will Deacon
Linux requires a number of atomic operations to provide full barrier semantics, that is no memory accesses after the operation can be observed before any accesses up to and including the operation in program order. On arm64, these operations have been incorrectly implemented as follows:

    // A, B, C are independent memory locations
    <Access [A]>
    // atomic_op (B)
    1:  ldaxr   x0, [B]         // Exclusive load with acquire
        <op(B)>
        stlxr   w1, x0, [B]     // Exclusive store with release
        cbnz    w1, 1b
    <Access [C]>

The assumption here being that two half barriers are equivalent to a full barrier, so the only permitted ordering would be A -> B -> C (where B is the atomic operation involving both a load and a store). Unfortunately, this is not the case by the letter of the architecture and, in fact, the accesses to A and C are permitted to pass their nearest half barrier resulting in orderings such as Bl -> A -> C -> Bs or Bl -> C -> A -> Bs (where Bl is the load-acquire on B and Bs is the store-release on B). This is a clear violation of the full barrier requirement. The simple way to fix this is to implement the same algorithm as ARMv7 using explicit barriers:

    <Access [A]>
    // atomic_op (B)
        dmb     ish             // Full barrier
    1:  ldxr    x0, [B]         // Exclusive load
        <op(B)>
        stxr    w1, x0, [B]     // Exclusive store
        cbnz    w1, 1b
        dmb     ish             // Full barrier
    <Access [C]>

but this has the undesirable effect of introducing *two* full barrier instructions. A better approach is actually the following, non-intuitive sequence:

    <Access [A]>
    // atomic_op (B)
    1:  ldxr    x0, [B]         // Exclusive load
        <op(B)>
        stlxr   w1, x0, [B]     // Exclusive store with release
        cbnz    w1, 1b
        dmb     ish             // Full barrier
    <Access [C]>

The simple observations here are:

  - The dmb ensures that no subsequent accesses (e.g. the access to C) can enter or pass the atomic sequence.
  - The dmb also ensures that no prior accesses (e.g. the access to A) can pass the atomic sequence.
  - Therefore, no prior access can pass a subsequent access, or vice-versa (i.e. A is strictly ordered before C).
  - The stlxr ensures that no prior access can pass the store component of the atomic operation.

The only tricky part remaining is the ordering between the ldxr and the access to A, since the absence of the first dmb means that we're now permitting re-ordering between the ldxr and any prior accesses. From an (arbitrary) observer's point of view, there are two scenarios:

  1. We have observed the ldxr. This means that if we perform a store to [B], the ldxr will still return older data. If we can observe the ldxr, then we can potentially observe the permitted re-ordering with the access to A, which is clearly an issue when compared to the dmb variant of the code. Thankfully, the exclusive monitor will save us here since it will be cleared as a result of the store and the ldxr will retry. Notice that any use of a later memory observation to imply observation of the ldxr will also imply observation of the access to A, since the stlxr/dmb ensure strict ordering.

  2. We have not observed the ldxr. This means we can perform a store and influence the later ldxr. However, that doesn't actually tell us anything about the access to [A], so we've not lost anything here either when compared to the dmb variant.

This patch implements this solution for our barriered atomic operations, ensuring that we satisfy the full barrier requirements where they are needed. Cc: <stable@vger.kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

- 20 Dec 2013, 1 commit

By Will Deacon
This patch implements the word-at-a-time interface for arm64 using the same algorithm as ARM. We use the fls64 macro, which expands to a clz instruction via a compiler builtin. Big-endian configurations make use of the implementation from asm-generic. With this implemented, we can replace our byte-at-a-time strnlen_user and strncpy_from_user functions with the optimised generic versions. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
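The core trick behind such word-at-a-time scanning can be sketched in plain C (a hedged, little-endian illustration using a count-leading-zeros builtin in place of fls64; this is not the kernel's <asm/word-at-a-time.h>):

    #include <stddef.h>
    #include <stdint.h>

    /*
     * Sketch only: detect a zero byte in a 64-bit word with the classic
     * bit trick, then locate it via clz. Returns 8 if no byte is zero.
     */
    static size_t first_zero_byte(uint64_t v)
    {
            uint64_t mask = (v - 0x0101010101010101ULL) & ~v &
                            0x8080808080808080ULL;

            if (!mask)
                    return 8;                   /* no zero byte in this word */

            /* The lowest set bit of mask sits inside the first zero byte. */
            return (size_t)(63 - __builtin_clzll(mask & -mask)) / 8;
    }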

- 08 May 2013, 1 commit

By Catalin Marinas
The bitops prototypes use an 'int' as the bit index type, but the asm implementation assumes it to be a 'long'. Since the compiler does not guarantee zeroing the upper 32 bits in a register when it is used as an 'int', change the bitops implementation accordingly. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

- 30 Apr 2013, 2 commits

By Catalin Marinas
This patch changes the test_and_*_bit functions to use the load-acquire/store-release instructions instead of an explicit DMB. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

By Mark Rutland
We're currently relying on unpredictable behaviour in our testops (test_and_*_bit), as stxr is unpredictable when the status register and the source register are the same. This patch reallocates the status register so as to bring us back into the realm of predictable behaviour. Boot tested on an AEMv8 model. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

- 22 Mar 2013, 3 commits

By Catalin Marinas
This patch implements the AArch64-specific atomic bitops functions using exclusive memory accesses to avoid locking. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

By Catalin Marinas
This patch introduces AArch64-specific string functions (strchr, strrchr). Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

By Catalin Marinas
This patch introduces AArch64-specific memory functions (memcpy, memmove, memchr, memset). These functions are not optimised for any CPU implementation but can be used as a starting point once hardware is available. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

- 17 Sep 2012, 2 commits

By Marc Zyngier
This patch adds udelay, memory and bit operations together with the ksyms exports. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Tony Lindgren <tony@atomide.com> Acked-by: Nicolas Pitre <nico@linaro.org> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

By Catalin Marinas
This patch adds support for various user access functions. These functions use the standard LDR/STR instructions and not the LDRT/STRT variants, in order to allow kernel addresses (after set_fs(KERNEL_DS)). Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Tony Lindgren <tony@atomide.com> Acked-by: Nicolas Pitre <nico@linaro.org> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Acked-by: Arnd Bergmann <arnd@arndb.de>