- 20 3月, 2015 5 次提交
-
-
由 Ard Biesheuvel 提交于
The global processor_id is assigned the MIDR_EL1 value of the boot CPU in the early init code, but is never referenced afterwards. As the relevance of the MIDR_EL1 value of the boot CPU is debatable anyway, especially under big.LITTLE, let's remove it before anyone starts using it. Tested-by: NMark Rutland <mark.rutland@arm.com> Reviewed-by: NMark Rutland <mark.rutland@arm.com> Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Ard Biesheuvel 提交于
The adrp instruction is mostly used in combination with either an add, a ldr or a str instruction with the low bits of the referenced symbol in the 12-bit immediate of the followup instruction. Introduce the macros adr_l, ldr_l and str_l that encapsulate these common patterns. Tested-by: NMark Rutland <mark.rutland@arm.com> Reviewed-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Marc Zyngier 提交于
struct cpu_table is an artifact left from the (very) early days of the arm64 port, and its only real use is to allow the most beautiful "AArch64 Processor" string to be displayed at boot time. Really? Yes, really. Let's get rid of it. In order to avoid another BogoMips-gate, the aforementioned string is preserved. Acked-by: NMark Rutland <mark.rutland@arm.com> Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Ganapatrao Kulkarni 提交于
Raise the maximum CPU limit to 4096 in preparation for upcoming platforms with large core counts. Acked-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NGanapatrao Kulkarni <gkulkarni@caviumnetworks.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Suzuki K. Poulose 提交于
The perf core implicitly rejects events spanning multiple HW PMUs, as in these cases the event->ctx will differ. However this validation is performed after pmu::event_init() is called in perf_init_event(), and thus pmu::event_init() may be called with a group leader from a different HW PMU. The ARM64 PMU driver does not take this fact into account, and when validating groups assumes that it can call to_arm_pmu(event->pmu) for any HW event. When the event in question is from another HW PMU this is wrong, and results in dereferencing garbage. This patch updates the ARM64 PMU driver to first test for and reject events from other PMUs, moving the to_arm_pmu and related logic after this test. Fixes a crash triggered by perf_fuzzer on Linux-4.0-rc2, with a CCI PMU present: Bad mode in Synchronous Abort handler detected, code 0x86000006 -- IABT (current EL) CPU: 0 PID: 1371 Comm: perf_fuzzer Not tainted 3.19.0+ #249 Hardware name: V2F-1XV7 Cortex-A53x2 SMM (DT) task: ffffffc07c73a280 ti: ffffffc07b0a0000 task.ti: ffffffc07b0a0000 PC is at 0x0 LR is at validate_event+0x90/0xa8 pc : [<0000000000000000>] lr : [<ffffffc000090228>] pstate: 00000145 sp : ffffffc07b0a3ba0 [< (null)>] (null) [<ffffffc0000907d8>] armpmu_event_init+0x174/0x3cc [<ffffffc00015d870>] perf_try_init_event+0x34/0x70 [<ffffffc000164094>] perf_init_event+0xe0/0x10c [<ffffffc000164348>] perf_event_alloc+0x288/0x358 [<ffffffc000164c5c>] SyS_perf_event_open+0x464/0x98c Code: bad PC value Also cleans up the code to use the arm_pmu only when we know that we are dealing with an arm pmu event. Cc: Will Deacon <will.deacon@arm.com> Acked-by: NMark Rutland <mark.rutland@arm.com> Acked-by: NPeter Ziljstra (Intel) <peterz@infradead.org> Signed-off-by: NSuzuki K. Poulose <suzuki.poulose@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 19 3月, 2015 6 次提交
-
-
由 Ard Biesheuvel 提交于
This changes the AES core transform implementations to issue aese/aesmc (and aesd/aesimc) in pairs. This enables a micro-architectural optimization in recent Cortex-A5x cores that improves performance by 50-90%. Measured performance in cycles per byte (Cortex-A57): CBC enc CBC dec CTR before 3.64 1.34 1.32 after 1.95 0.85 0.93 Note that this results in a ~5% performance decrease for older cores. Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Mark Rutland 提交于
Fixmap indices are in the interval (FIX_HOLE, __end_of_fixed_addresses), but in __set_fixmap we only check idx <= __end_of_fixed_addresses, and therefore indices <= FIX_HOLE are erroneously accepted. If called with such an idx, __set_fixmap may corrupt page tables outside of the fixmap region. This patch ensures that we validate the idx against both endpoints of the interval. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Kees Cook <keescook@chromium.org> Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: NLaura Abbott <lauraa@codeaurora.org> Signed-off-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Mark Rutland 提交于
The FIX_TEXT_POKE0 is currently at the end of the temporary fixmap slots, despite the fact that it can be used at any point during runtime (e.g. for poking the text of loaded modules), and thus should be a permanent fixmap slot (as is the case on arm and x86). This patch moves FIX_TEXT_POKE0 into the set of permanent fixmap slots. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Kees Cook <keescook@chromium.org> Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: NLaura Abbott <lauraa@codeaurora.org> Signed-off-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Daniel Borkmann 提交于
This effectively unexports set_memory_ro and set_memory_rw functions from commit 11d91a77 ("arm64: Add CONFIG_DEBUG_SET_MODULE_RONX support"). No module user of those is in mainline kernel and we explicitly do not want modules to use these functions, as they i.e. RO-protect eBPF (interpreted and JIT'ed) images from malicious modifications/bugs. Outside of eBPF scope, I believe also other set_memory_* functions should be unexported on arm64 due to non-existant mainline module user. Laura mentioned that they have some uses for modules doing set_memory_*, but none that are in mainline and it's unclear if they would ever get there. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@plumgrid.com> Acked-by: NLaura Abbott <lauraa@codeaurora.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Alexander Graf 提交于
With binutils 2.25 the default alignment for 32bit arm sections changed to have everything 64k aligned. Armv7 binaries built with this binutils version run successfully on an arm64 system. Since effectively there is now the chance to run armv7 code on arm64 even with 64k page size, it doesn't make sense to block people from enabling CONFIG_COMPAT on those configurations. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Andreas Schwab 提交于
The arm mmap2 syscall takes the offset in units of 4K, thus with 64K pages the offset needs to be scaled to units of pages. Signed-off-by: NAndreas Schwab <schwab@suse.de> Signed-off-by: NAlexander Graf <agraf@suse.de> [will: removed redundant lr parameter, localised PAGE_SHIFT #if check] Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 18 3月, 2015 4 次提交
-
-
由 Steve Capper 提交于
Commit f4f75ad5 ("efi: efistub: Convert into static library") introduced a static library for EFI stub, libstub. The EFI libstub directory is referenced by the kernel build system via a obj subdirectory rule in: drivers/firmware/efi/Makefile Unfortunately, arm64 also references the EFI libstub via: libs-$(CONFIG_EFI_STUB) += drivers/firmware/efi/libstub/ If we're unlucky, the kernel build system can enter libstub via two simultaneous threads resulting in build failures such as: fixdep: error opening depfile: drivers/firmware/efi/libstub/.efi-stub-helper.o.d: No such file or directory scripts/Makefile.build:257: recipe for target 'drivers/firmware/efi/libstub/efi-stub-helper.o' failed make[1]: *** [drivers/firmware/efi/libstub/efi-stub-helper.o] Error 2 Makefile:939: recipe for target 'drivers/firmware/efi/libstub' failed make: *** [drivers/firmware/efi/libstub] Error 2 make: *** Waiting for unfinished jobs.... This patch adjusts the arm64 Makefile to reference the compiled library explicitly (as is currently done in x86), rather than the directory. Fixes: f4f75ad5 efi: efistub: Convert into static library Signed-off-by: NSteve Capper <steve.capper@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Mark Rutland 提交于
We currently don't log the boot mode for arm64 as we do for arm, and without KVM the user is provided with no indication as to which mode(s) CPUs were booted in, which can seriously hinder debugging in some cases. Add logging to the boot path once all CPUs are up. Where CPUs are mismatched in violation of the boot protocol, WARN and set a taint (as we do for CPU other CPU feature mismatches) given that the firmware/bootloader is buggy and should be fixed. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Tested-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Mark Rutland 提交于
Commit 828e9834 ("arm64: head: create a new function for setting the boot_cpu_mode flag") added BOOT_CPU_MODE_EL1, a nonzero value replacing uses of zero. However it failed to update __boot_cpu_mode appropriately. A CPU booted at EL2 writes BOOT_CPU_MODE_EL2 to __boot_cpu_mode[0], and a CPU booted at EL1 writes BOOT_CPU_MODE_EL1 to __boot_cpu_mode[1]. Later is_hyp_mode_mismatched() determines there to be a mismatch if __boot_cpu_mode[0] != __boot_cpu_mode[1]. If all CPUs are booted at EL1, __boot_cpu_mode[0] will be set to BOOT_CPU_MODE_EL1, but __boot_cpu_mode[1] will retain its initial value of zero, and is_hyp_mode_mismatched will erroneously determine that the boot modes are mismatched. This hasn't been a problem so far, but later patches which will make use of is_hyp_mode_mismatched() expect it to work correctly. This patch initialises __boot_cpu_mode[1] to BOOT_CPU_MODE_EL1, fixing the erroneous mismatch detection when all CPUs are booted at EL1. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Tested-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Mark Rutland 提交于
Currently we only perform alternative patching for kernels built with CONFIG_SMP, as we call apply_alternatives_all() in smp.c, which is only built for CONFIG_SMP. Thus !SMP kernels may not have necessary alternatives patched in. This patch ensures that we call apply_alternatives_all() once all CPUs are booted, even for !SMP kernels, by having the smp_init_cpus() stub call this for !SMP kernels via up_late_init. A new wrapper, do_post_cpus_up_work, is added so we can hook other calls here later (e.g. boot mode logging). Cc: Andre Przywara <andre.przywara@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Fixes: e039ee4e ("arm64: add alternative runtime patching") Tested-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 17 3月, 2015 1 次提交
-
-
由 Peter Crosthwaite 提交于
ARM64 has the yield nop hint which has the intended semantics of cpu_relax. Implement. The immediate application is ARM CPU emulators. An emulator can take advantage of the yield hint to de-prioritise an emulated CPU in favor of other emulation tasks. QEMU A64 SMP emulation has yield awareness, and sees a significant boot time performance increase with this change. Signed-off-by: NPeter Crosthwaite <peter.crosthwaite@xilinx.com> Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 14 3月, 2015 3 次提交
-
-
由 Ard Biesheuvel 提交于
Another one for the big head.S spring cleaning: the label should be after the .align or it may point to the padding. Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Ard Biesheuvel 提交于
If UEFI Runtime Services are available, they are preferred over direct PSCI calls or other methods to reset the system. For the reset case, we need to hook into machine_restart(), as the arm_pm_restart function pointer may be overwritten by modules. Tested-by: NMark Rutland <mark.rutland@arm.com> Reviewed-by: NMark Rutland <mark.rutland@arm.com> Reviewed-by: NMatt Fleming <matt.fleming@intel.com> Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Catalin Marinas 提交于
The ARM architecture allows the caching of intermediate page table levels and page table freeing requires a sequence like: pmd_clear() TLB invalidation pte page freeing With commit 5e5f6dc1 (arm64: mm: enable HAVE_RCU_TABLE_FREE logic), the page table freeing batching was moved from tlb_remove_page() to tlb_remove_table(). The former takes care of TLB invalidation as this is also shared with pte clearing and page cache page freeing. The latter, however, does not invalidate the TLBs for intermediate page table levels as it probably relies on the architecture code to do it if required. When the mm->mm_users < 2, tlb_remove_table() does not do any batching and page table pages are freed before tlb_finish_mmu() which performs the actual TLB invalidation. This patch introduces __tlb_flush_pgtable() for arm64 and calls it from the {pte,pmd,pud}_free_tlb() directly without relying on deferred page table freeing. Fixes: 5e5f6dc1 arm64: mm: enable HAVE_RCU_TABLE_FREE logic Reported-by: NJon Masters <jcm@redhat.com> Tested-by: NJon Masters <jcm@redhat.com> Tested-by: NSteve Capper <steve.capper@linaro.org> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 06 3月, 2015 1 次提交
-
-
由 Laura Abbott 提交于
The set_memory_* functions currently only support module addresses. The addresses are validated using is_module_addr. That function is special though and relies on internal state in the module subsystem to work properly. At the time of module initialization and calling set_memory_*, it's too early for is_module_addr to work properly so it always returns false. Rather than be subject to the whims of the module state, just bounds check against the module virtual address range. Signed-off-by: NLaura Abbott <lauraa@codeaurora.org> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 05 3月, 2015 1 次提交
-
-
由 Iyappan Subramanian 提交于
This patch fixes the backward compatibility of the older driver with the newer firmware by making the binding unique so that the older driver won't recognize the non-supported interfaces. The new bindings are in sync with the newer firmware. Signed-off-by: NIyappan Subramanian <isubramanian@apm.com> Signed-off-by: NKeyur Chudgar <kchudgar@apm.com> Tested-by: NMark Langsdorf <mlangsdo@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 2月, 2015 3 次提交
-
-
由 Lorenzo Pieralisi 提交于
ARM64 CPUidle driver requires the cpu_do_idle function so that it can be used to enter the shallowest idle state, and it is declared in asm/proc-fns.h. The current ARM64 CPUidle driver does not include asm/proc-fns.h explicitly and it has so far relied on implicit inclusion from other header files. Owing to some header dependencies reshuffling this currently triggers build failures when CONFIG_ARM64_64K_PAGES=y: drivers/cpuidle/cpuidle-arm64.c: In function "arm64_enter_idle_state" drivers/cpuidle/cpuidle-arm64.c:42:3: error: implicit declaration of function "cpu_do_idle" [-Werror=implicit-function-declaration] cpu_do_idle(); ^ This patch adds the explicit inclusion of the asm/proc-fns.h header file in the arm64 asm/cpuidle.h header file, so that the build breakage is fixed and the required header inclusion is added to the appropriate arch back-end CPUidle header, already included by the CPUidle arm64 driver, where CPUidle arch related function declarations belong. Reported-by: NLaura Abbott <lauraa@codeaurora.org> Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: NWill Deacon <will.deacon@arm.com> Tested-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Catalin Marinas 提交于
The native (64-bit) sigval_t union contains sival_int (32-bit) and sival_ptr (64-bit). When a compat application invokes a syscall that takes a sigval_t value (as part of a larger structure, e.g. compat_sys_mq_notify, compat_sys_timer_create), the compat_sigval_t union is converted to the native sigval_t with sival_int overlapping with either the least or the most significant half of sival_ptr, depending on endianness. When the corresponding signal is delivered to a compat application, on big endian the current (compat_uptr_t)sival_ptr cast always returns 0 since sival_int corresponds to the top part of sival_ptr. This patch fixes copy_siginfo_to_user32() so that sival_int is copied to the compat_siginfo_t structure. Cc: <stable@vger.kernel.org> Reported-by: NBamvor Jian Zhang <bamvor.zhangjian@huawei.com> Tested-by: NBamvor Jian Zhang <bamvor.zhangjian@huawei.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Catalin Marinas 提交于
With commit 3690951f (arm64: Use swiotlb late initialisation), the swiotlb buffer size is limited to MAX_ORDER_NR_PAGES. However, there are platforms with 32-bit only devices that require bounce buffering via swiotlb. This patch changes the swiotlb initialisation to an early 64MB memblock allocation. In order to get the swiotlb buffer correctly allocated (via memblock_virt_alloc_low_nopanic), this patch also defines ARCH_LOW_ADDRESS_LIMIT to the maximum physical address capable of 32-bit DMA. Reported-by: NKefeng Wang <wangkefeng.wang@huawei.com> Tested-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 27 2月, 2015 6 次提交
-
-
由 Marc Zyngier 提交于
Patch 2f896d58 ("arm64: use fixmap for text patching") changed the way we patch the kernel text, using a fixmap when the kernel or modules are flagged as read only. Unfortunately, a flaw in the logic makes it fall over when patching modules without CONFIG_DEBUG_SET_MODULE_RONX enabled: [...] [ 32.032636] Call trace: [ 32.032716] [<fffffe00003da0dc>] __copy_to_user+0x2c/0x60 [ 32.032837] [<fffffe0000099f08>] __aarch64_insn_write+0x94/0xf8 [ 32.033027] [<fffffe000009a0a0>] aarch64_insn_patch_text_nosync+0x18/0x58 [ 32.033200] [<fffffe000009c3ec>] ftrace_modify_code+0x58/0x84 [ 32.033363] [<fffffe000009c4e4>] ftrace_make_nop+0x3c/0x58 [ 32.033532] [<fffffe0000164420>] ftrace_process_locs+0x3d0/0x5c8 [ 32.033709] [<fffffe00001661cc>] ftrace_module_init+0x28/0x34 [ 32.033882] [<fffffe0000135148>] load_module+0xbb8/0xfc4 [ 32.034044] [<fffffe0000135714>] SyS_finit_module+0x94/0xc4 [...] This is triggered by the use of virt_to_page() on a module address, which ends to pointing to Nowhereland if you're lucky, or corrupt your precious data if not. This patch fixes the logic by mimicking what is done on arm: - If we're patching a module and CONFIG_DEBUG_SET_MODULE_RONX is set, use vmalloc_to_page(). - If we're patching the kernel and CONFIG_DEBUG_RODATA is set, use virt_to_page(). - Otherwise, use the provided address, as we can write to it directly. Tested on 4.0-rc1 as a KVM guest. Reported-by: NRichard W.M. Jones <rjones@redhat.com> Reviewed-by: NKees Cook <keescook@chromium.org> Acked-by: NMark Rutland <mark.rutland@arm.com> Acked-by: NLaura Abbott <lauraa@codeaurora.org> Tested-by: NRichard W.M. Jones <rjones@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Ard Biesheuvel 提交于
This patch increases the interleave factor for parallel AES modes to 4x. This improves performance on Cortex-A57 by ~35%. This is due to the 3-cycle latency of AES instructions on the A57's relatively deep pipeline (compared to Cortex-A53 where the AES instruction latency is only 2 cycles). At the same time, disable inline expansion of the core AES functions, as the performance benefit of this feature is negligible. Measured on AMD Seattle (using tcrypt.ko mode=500 sec=1): Baseline (2x interleave, inline expansion) ------------------------------------------ testing speed of async cbc(aes) (cbc-aes-ce) decryption test 4 (128 bit key, 8192 byte blocks): 95545 operations in 1 seconds test 14 (256 bit key, 8192 byte blocks): 68496 operations in 1 seconds This patch (4x interleave, no inline expansion) ----------------------------------------------- testing speed of async cbc(aes) (cbc-aes-ce) decryption test 4 (128 bit key, 8192 byte blocks): 124735 operations in 1 seconds test 14 (256 bit key, 8192 byte blocks): 92328 operations in 1 seconds Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Feng Kan 提交于
Caught during Trinity testing. The pte_modify does not allow modification for PTE type bit. This cause the test to hang the system. It is found that the PTE can't transit from an inaccessible page (b00) to a valid page (b11) because the mask does not allow it. This happens when a big block of mmaped memory is set the PROT_NONE, then the a small piece is broken off and set to PROT_WRITE | PROT_READ cause a huge page split. Signed-off-by: NFeng Kan <fkan@apm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Yingjoe Chen 提交于
The functions __cpu_flush_user_tlb_range and __cpu_flush_kern_tlb_range were removed in commit fa48e6f7 'arm64: mm: Optimise tlb flush logic where we have >4K granule'. Global variable cpu_tlb was never used in arm64. Remove them. Signed-off-by: NYingjoe Chen <yingjoe.chen@mediatek.com> Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Will Deacon 提交于
An arm64 allmodconfig fails to build with GCC 5 due to __asmeq assertions in the PSCI firmware calling code firing due to mcount preambles breaking our assumptions about register allocation of function arguments: /tmp/ccDqJsJ6.s: Assembler messages: /tmp/ccDqJsJ6.s:60: Error: .err encountered /tmp/ccDqJsJ6.s:61: Error: .err encountered /tmp/ccDqJsJ6.s:62: Error: .err encountered /tmp/ccDqJsJ6.s:99: Error: .err encountered /tmp/ccDqJsJ6.s:100: Error: .err encountered /tmp/ccDqJsJ6.s:101: Error: .err encountered This patch fixes the issue by moving the PSCI calls out-of-line into their own assembly files, which are safe from the compiler's meddling fingers. Reported-by: NAndy Whitcroft <apw@canonical.com> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Nathan Lynch 提交于
The vdso implementation of clock_getres currently returns 0 (success) whenever a null timespec is provided by the caller, regardless of the clock id supplied. This behavior is incorrect. It should fall back to syscall when an unrecognized clock id is passed, even when the timespec argument is null. This ensures that clock_getres always returns an error for invalid clock ids. Signed-off-by: NNathan Lynch <nathan_lynch@mentor.com> Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 26 2月, 2015 1 次提交
-
-
由 Sudeep Holla 提交于
Commit 5d425c18 ("arm64: kernel: add support for cpu cache information") adds cacheinfo support for ARM64. Since there's no architectural way of detecting the cpus that share particular cache, device tree can be used and the core cacheinfo already supports the same. This patch adds the L2 cache topology on Juno board, FVP/RTSM and foundation models. Signed-off-by: NSudeep Holla <sudeep.holla@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Liviu Dudau <Liviu.Dudau@arm.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
- 23 2月, 2015 3 次提交
-
-
由 Marc Zyngier 提交于
asm/assembler.h lacks the usual guard against multiple inclusion, leading to a compilation failure if it is accidentally included twice. Using the classic #ifndef/#define/#endif construct solves the issue. Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Robin Murphy 提交于
Fix cbz/cbnz having the mask offset by a bit, and add encodings for tbz/tbnz so that all branch forms are represented. Signed-off-by: NRobin Murphy <robin.murphy@arm.com> Acked-by: NWill Deacon <will.deacon@arm.com> Acked-by: NZi Shen Lim <zlim.lnx@gmail.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Pratyush Anand 提交于
ftrace_enable_ftrace_graph_caller and ftrace_disable_ftrace_graph_caller should replace B(jmp) instruction and not BL(call) instruction. Commit 9f1ae759("arm64: Correct ftrace calls to aarch64_insn_gen_branch_imm()") had a typo and used AARCH64_INSN_BRANCH_LINK instead of AARCH64_INSN_BRANCH_NOLINK. Either instruction will work, as the link register is saved/restored across the branch but this better matches the intention of the code. Signed-off-by: NPratyush Anand <panand@redhat.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 14 2月, 2015 1 次提交
-
-
由 Andrey Ryabinin 提交于
For instrumenting global variables KASan will shadow memory backing memory for modules. So on module loading we will need to allocate memory for shadow and map it at address in shadow that corresponds to the address allocated in module_alloc(). __vmalloc_node_range() could be used for this purpose, except it puts a guard hole after allocated area. Guard hole in shadow memory should be a problem because at some future point we might need to have a shadow memory at address occupied by guard hole. So we could fail to allocate shadow for module_alloc(). Now we have VM_NO_GUARD flag disabling guard page, so we need to pass into __vmalloc_node_range(). Add new parameter 'vm_flags' to __vmalloc_node_range() function. Signed-off-by: NAndrey Ryabinin <a.ryabinin@samsung.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Konstantin Serebryany <kcc@google.com> Cc: Dmitry Chernenkov <dmitryc@google.com> Signed-off-by: NAndrey Konovalov <adech.fo@gmail.com> Cc: Yuri Gribov <tetra2005@gmail.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Christoph Lameter <cl@linux.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 2月, 2015 1 次提交
-
-
由 Andy Lutomirski 提交于
If an attacker can cause a controlled kernel stack overflow, overwriting the restart block is a very juicy exploit target. This is because the restart_block is held in the same memory allocation as the kernel stack. Moving the restart block to struct task_struct prevents this exploit by making the restart_block harder to locate. Note that there are other fields in thread_info that are also easy targets, at least on some architectures. It's also a decent simplification, since the restart code is more or less identical on all architectures. [james.hogan@imgtec.com: metag: align thread_info::supervisor_stack] Signed-off-by: NAndy Lutomirski <luto@amacapital.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: David Miller <davem@davemloft.net> Acked-by: NRichard Weinberger <richard@nod.at> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Steven Miao <realmz6@gmail.com> Cc: Mark Salter <msalter@redhat.com> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Chen Liqin <liqin.linux@gmail.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: NJames Hogan <james.hogan@imgtec.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 12 2月, 2015 2 次提交
-
-
由 Kirill A. Shutemov 提交于
LKP has triggered a compiler warning after my recent patch "mm: account pmd page tables to the process": mm/mmap.c: In function 'exit_mmap': >> mm/mmap.c:2857:2: warning: right shift count >= width of type [enabled by default] The code: > 2857 WARN_ON(mm_nr_pmds(mm) > 2858 round_up(FIRST_USER_ADDRESS, PUD_SIZE) >> PUD_SHIFT); In this, on tile, we have FIRST_USER_ADDRESS defined as 0. round_up() has the same type -- int. PUD_SHIFT. I think the best way to fix it is to define FIRST_USER_ADDRESS as unsigned long. On every arch for consistency. Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: NWu Fengguang <fengguang.wu@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Naoya Horiguchi 提交于
Currently we have many duplicates in definitions around follow_huge_addr(), follow_huge_pmd(), and follow_huge_pud(), so this patch tries to remove the m. The basic idea is to put the default implementation for these functions in mm/hugetlb.c as weak symbols (regardless of CONFIG_ARCH_WANT_GENERAL_HUGETL B), and to implement arch-specific code only when the arch needs it. For follow_huge_addr(), only powerpc and ia64 have their own implementation, and in all other architectures this function just returns ERR_PTR(-EINVAL). So this patch sets returning ERR_PTR(-EINVAL) as default. As for follow_huge_(pmd|pud)(), if (pmd|pud)_huge() is implemented to always return 0 in your architecture (like in ia64 or sparc,) it's never called (the callsite is optimized away) no matter how implemented it is. So in such architectures, we don't need arch-specific implementation. In some architecture (like mips, s390 and tile,) their current arch-specific follow_huge_(pmd|pud)() are effectively identical with the common code, so this patch lets these architecture use the common code. One exception is metag, where pmd_huge() could return non-zero but it expects follow_huge_pmd() to always return NULL. This means that we need arch-specific implementation which returns NULL. This behavior looks strange to me (because non-zero pmd_huge() implies that the architecture supports PMD-based hugepage, so follow_huge_pmd() can/should return some relevant value,) but that's beyond this cleanup patch, so let's keep it. Justification of non-trivial changes: - in s390, follow_huge_pmd() checks !MACHINE_HAS_HPAGE at first, and this patch removes the check. This is OK because we can assume MACHINE_HAS_HPAGE is true when follow_huge_pmd() can be called (note that pmd_huge() has the same check and always returns 0 for !MACHINE_HAS_HPAGE.) - in s390 and mips, we use HPAGE_MASK instead of PMD_MASK as done in common code. This patch forces these archs use PMD_MASK, but it's OK because they are identical in both archs. In s390, both of HPAGE_SHIFT and PMD_SHIFT are 20. In mips, HPAGE_SHIFT is defined as (PAGE_SHIFT + PAGE_SHIFT - 3) and PMD_SHIFT is define as (PAGE_SHIFT + PAGE_SHIFT + PTE_ORDER - 3), but PTE_ORDER is always 0, so these are identical. Signed-off-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com> Acked-by: NHugh Dickins <hughd@google.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: David Rientjes <rientjes@google.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Steve Capper <steve.capper@linaro.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 11 2月, 2015 1 次提交
-
-
由 Kirill A. Shutemov 提交于
We've replaced remap_file_pages(2) implementation with emulation. Nobody creates non-linear mapping anymore. This patch also adjust __SWP_TYPE_SHIFT and increase number of bits availble for swap offset. Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 2月, 2015 1 次提交
-
-
由 Paolo Bonzini 提交于
This patch introduces a new module parameter for the KVM module; when it is present, KVM attempts a bit of polling on every HLT before scheduling itself out via kvm_vcpu_block. This parameter helps a lot for latency-bound workloads---in particular I tested it with O_DSYNC writes with a battery-backed disk in the host. In this case, writes are fast (because the data doesn't have to go all the way to the platters) but they cannot be merged by either the host or the guest. KVM's performance here is usually around 30% of bare metal, or 50% if you use cache=directsync or cache=writethrough (these parameters avoid that the guest sends pointless flush requests, and at the same time they are not slow because of the battery-backed cache). The bad performance happens because on every halt the host CPU decides to halt itself too. When the interrupt comes, the vCPU thread is then migrated to a new physical CPU, and in general the latency is horrible because the vCPU thread has to be scheduled back in. With this patch performance reaches 60-65% of bare metal and, more important, 99% of what you get if you use idle=poll in the guest. This means that the tunable gets rid of this particular bottleneck, and more work can be done to improve performance in the kernel or QEMU. Of course there is some price to pay; every time an otherwise idle vCPUs is interrupted by an interrupt, it will poll unnecessarily and thus impose a little load on the host. The above results were obtained with a mostly random value of the parameter (500000), and the load was around 1.5-2.5% CPU usage on one of the host's core for each idle guest vCPU. The patch also adds a new stat, /sys/kernel/debug/kvm/halt_successful_poll, that can be used to tune the parameter. It counts how many HLT instructions received an interrupt during the polling period; each successful poll avoids that Linux schedules the VCPU thread out and back in, and may also avoid a likely trip to C1 and back for the physical CPU. While the VM is idle, a Linux 4 VCPU VM halts around 10 times per second. Of these halts, almost all are failed polls. During the benchmark, instead, basically all halts end within the polling period, except a more or less constant stream of 50 per second coming from vCPUs that are not running the benchmark. The wasted time is thus very low. Things may be slightly different for Windows VMs, which have a ~10 ms timer tick. The effect is also visible on Marcelo's recently-introduced latency test for the TSC deadline timer. Though of course a non-RT kernel has awful latency bounds, the latency of the timer is around 8000-10000 clock cycles compared to 20000-120000 without setting halt_poll_ns. For the TSC deadline timer, thus, the effect is both a smaller average latency and a smaller variance. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-