- 08 1月, 2014 5 次提交
-
-
由 Jiang Liu 提交于
Optimize jump label implementation for ARM64 by dynamically patching kernel text. Reviewed-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NJiang Liu <liuj97@gmail.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Jiang Liu 提交于
Introduce aarch64_insn_gen_{nop|branch_imm}() helper functions, which will be used to implement jump label on ARM64. Reviewed-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NJiang Liu <liuj97@gmail.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Jiang Liu 提交于
Function encode_insn_immediate() will be used by other instruction manipulate related functions, so move it into insn.c and rename it as aarch64_insn_encode_immediate(). Reviewed-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NJiang Liu <liuj97@gmail.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Jiang Liu 提交于
Introduce three interfaces to patch kernel and module code: aarch64_insn_patch_text_nosync(): patch code without synchronization, it's caller's responsibility to synchronize all CPUs if needed. aarch64_insn_patch_text_sync(): patch code and always synchronize with stop_machine() aarch64_insn_patch_text(): patch code and synchronize with stop_machine() if needed Reviewed-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NJiang Liu <liuj97@gmail.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Jiang Liu 提交于
Introduce basic aarch64 instruction decoding helper aarch64_get_insn_class() and aarch64_insn_hotpatch_safe(). Reviewed-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NJiang Liu <liuj97@gmail.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 20 12月, 2013 9 次提交
-
-
由 Laura Abbott 提交于
arm64 bit targets need the features CMA provides. Add the appropriate hooks, header files, and Kconfig to allow this to happen. Cc: Will Deacon <will.deacon@arm.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: NLaura Abbott <lauraa@codeaurora.org> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Ard Biesheuvel 提交于
asm/cputype.h contains a bunch of #defines for CPU id registers that essentially map to themselves. Remove the #defines and pass the tokens directly to the inline asm() that reads the registers. Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Mark Hambleton 提交于
Make sure the value we are going to return is referenced in order to avoid warnings from newer GCCs such as: arch/arm64/include/asm/cmpxchg.h:162:3: warning: value computed is not used [-Wunused-value] ((__typeof__(*(ptr)))__cmpxchg_mb((ptr), \ ^ net/netfilter/nf_conntrack_core.c:674:2: note: in expansion of macro ‘cmpxchg’ cmpxchg(&nf_conntrack_hash_rnd, 0, rand); [Modified to use the current underlying implementation as current mainline for both cmpxchg() and cmpxchg_local() does -- broonie] Signed-off-by: NMark Hambleton <mahamble@broadcom.com> Signed-off-by: NMark Brown <broonie@linaro.org> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Sandeepa Prabhu 提交于
AArch64 Single Steping and Breakpoint debug exceptions will be used by multiple debug framworks like kprobes & kgdb. This patch implements the hooks for those frameworks to register their own handlers for handling breakpoint and single step events. Reworked the debug exception handler in entry.S: do_dbg to route software breakpoint (BRK64) exception to do_debug_exception() Signed-off-by: NSandeepa Prabhu <sandeepa.prabhu@linaro.org> Signed-off-by: NDeepak Saxena <dsaxena@linaro.org> Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Will Deacon 提交于
DCACHE_WORD_ACCESS uses the word-at-a-time API for optimised string comparisons in the vfs layer. This patch implements support for load_unaligned_zeropad in much the same way as has been done for ARM, although big-endian systems are also supported. Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Will Deacon 提交于
AArch64 instructions must be 4-byte aligned, so make sure this is true for the futex .fixup section. Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Will Deacon 提交于
This patch implements the word-at-a-time interface for arm64 using the same algorithm as ARM. We use the fls64 macro, which expands to a clz instruction via a compiler builtin. Big-endian configurations make use of the implementation from asm-generic. With this implemented, we can replace our byte-at-a-time strnlen_user and strncpy_from_user functions with the optimised generic versions. Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Will Deacon 提交于
This patch implements optimised percpu variable accesses using the el1 r/w thread register (tpidr_el1) along the same lines as arch/arm/. Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Laura Abbott 提交于
The definition of virt_addr_valid is that virt_addr_valid should return true if and only if virt_to_page returns a valid pointer. The current definition of virt_addr_valid only checks against the virtual address range. There's no guarantee that just because a virtual address falls bewteen PAGE_OFFSET and high_memory the associated physical memory has a valid backing struct page. Follow the example of other architectures and convert to pfn_valid to verify that the virtual address is actually valid. Cc: Will Deacon <will.deacon@arm.com> Cc: Nicolas Pitre <nico@linaro.org> Signed-off-by: NLaura Abbott <lauraa@codeaurora.org> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 17 12月, 2013 5 次提交
-
-
由 Lorenzo Pieralisi 提交于
On platforms with power management capabilities, timers that are shut down when a CPU enters deep C-states must be emulated using an always-on timer and a timer IPI to relay the timer IRQ to target CPUs on an SMP system. This patch enables the generic clockevents broadcast infrastructure for arm64, by providing the required Kconfig entries and adding the timer IPI infrastructure. Acked-by: NDaniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
-
由 Lorenzo Pieralisi 提交于
Kernel subsystems like CPU idle and suspend to RAM require a generic mechanism to suspend a processor, save its context and put it into a quiescent state. The cpu_{suspend}/{resume} implementation provides such a framework through a kernel interface allowing to save/restore registers, flush the context to DRAM and suspend/resume to/from low-power states where processor context may be lost. The CPU suspend implementation relies on the suspend protocol registered in CPU operations to carry out a suspend request after context is saved and flushed to DRAM. The cpu_suspend interface: int cpu_suspend(unsigned long arg); allows to pass an opaque parameter that is handed over to the suspend CPU operations back-end so that it can take action according to the semantics attached to it. The arg parameter allows suspend to RAM and CPU idle drivers to communicate to suspend protocol back-ends; it requires standardization so that the interface can be reused seamlessly across systems, paving the way for generic drivers. Context memory is allocated on the stack, whose address is stashed in a per-cpu variable to keep track of it and passed to core functions that save/restore the registers required by the architecture. Even though, upon successful execution, the cpu_suspend function shuts down the suspending processor, the warm boot resume mechanism, based on the cpu_resume function, makes the resume path operate as a cpu_suspend function return, so that cpu_suspend can be treated as a C function by the caller, which simplifies coding the PM drivers that rely on the cpu_suspend API. Upon context save, the minimal amount of memory is flushed to DRAM so that it can be retrieved when the MMU is off and caches are not searched. The suspend CPU operation, depending on the required operations (eg CPU vs Cluster shutdown) is in charge of flushing the cache hierarchy either implicitly (by calling firmware implementations like PSCI) or explicitly by executing the required cache maintainance functions. Debug exceptions are disabled during cpu_{suspend}/{resume} operations so that debug registers can be saved and restored properly preventing preemption from debug agents enabled in the kernel. Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
-
由 Lorenzo Pieralisi 提交于
Power management software requires the kernel to save and restore CPU registers while going through suspend and resume operations triggered by kernel subsystems like CPU idle and suspend to RAM. This patch implements code that provides save and restore mechanism for the arm v8 implementation. Memory for the context is passed as parameter to both cpu_do_suspend and cpu_do_resume functions, and allows the callers to implement context allocation as they deem fit. The registers that are saved and restored correspond to the registers set actually required by the kernel to be up and running which represents a subset of v8 ISA. Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
-
由 Lorenzo Pieralisi 提交于
On ARM64 SMP systems, cores are identified by their MPIDR_EL1 register. The MPIDR_EL1 guidelines in the ARM ARM do not provide strict enforcement of MPIDR_EL1 layout, only recommendations that, if followed, split the MPIDR_EL1 on ARM 64 bit platforms in four affinity levels. In multi-cluster systems like big.LITTLE, if the affinity guidelines are followed, the MPIDR_EL1 can not be considered a linear index. This means that the association between logical CPU in the kernel and the HW CPU identifier becomes somewhat more complicated requiring methods like hashing to associate a given MPIDR_EL1 to a CPU logical index, in order for the look-up to be carried out in an efficient and scalable way. This patch provides a function in the kernel that starting from the cpu_logical_map, implement collision-free hashing of MPIDR_EL1 values by checking all significative bits of MPIDR_EL1 affinity level bitfields. The hashing can then be carried out through bits shifting and ORing; the resulting hash algorithm is a collision-free though not minimal hash that can be executed with few assembly instructions. The mpidr_el1 is filtered through a mpidr mask that is built by checking all bits that toggle in the set of MPIDR_EL1s corresponding to possible CPUs. Bits that do not toggle do not carry information so they do not contribute to the resulting hash. Pseudo code: /* check all bits that toggle, so they are required */ for (i = 1, mpidr_el1_mask = 0; i < num_possible_cpus(); i++) mpidr_el1_mask |= (cpu_logical_map(i) ^ cpu_logical_map(0)); /* * Build shifts to be applied to aff0, aff1, aff2, aff3 values to hash the * mpidr_el1 * fls() returns the last bit set in a word, 0 if none * ffs() returns the first bit set in a word, 0 if none */ fs0 = mpidr_el1_mask[7:0] ? ffs(mpidr_el1_mask[7:0]) - 1 : 0; fs1 = mpidr_el1_mask[15:8] ? ffs(mpidr_el1_mask[15:8]) - 1 : 0; fs2 = mpidr_el1_mask[23:16] ? ffs(mpidr_el1_mask[23:16]) - 1 : 0; fs3 = mpidr_el1_mask[39:32] ? ffs(mpidr_el1_mask[39:32]) - 1 : 0; ls0 = fls(mpidr_el1_mask[7:0]); ls1 = fls(mpidr_el1_mask[15:8]); ls2 = fls(mpidr_el1_mask[23:16]); ls3 = fls(mpidr_el1_mask[39:32]); bits0 = ls0 - fs0; bits1 = ls1 - fs1; bits2 = ls2 - fs2; bits3 = ls3 - fs3; aff0_shift = fs0; aff1_shift = 8 + fs1 - bits0; aff2_shift = 16 + fs2 - (bits0 + bits1); aff3_shift = 32 + fs3 - (bits0 + bits1 + bits2); u32 hash(u64 mpidr_el1) { u32 l[4]; u64 mpidr_el1_masked = mpidr_el1 & mpidr_el1_mask; l[0] = mpidr_el1_masked & 0xff; l[1] = mpidr_el1_masked & 0xff00; l[2] = mpidr_el1_masked & 0xff0000; l[3] = mpidr_el1_masked & 0xff00000000; return (l[0] >> aff0_shift | l[1] >> aff1_shift | l[2] >> aff2_shift | l[3] >> aff3_shift); } The hashing algorithm relies on the inherent properties set in the ARM ARM recommendations for the MPIDR_EL1. Exotic configurations, where for instance the MPIDR_EL1 values at a given affinity level have large holes, can end up requiring big hash tables since the compression of values that can be achieved through shifting is somewhat crippled when holes are present. Kernel warns if the number of buckets of the resulting hash table exceeds the number of possible CPUs by a factor of 4, which is a symptom of a very sparse HW MPIDR_EL1 configuration. The hash algorithm is quite simple and can easily be implemented in assembly code, to be used in code paths where the kernel virtual address space is not set-up (ie cpu_resume) and instruction and data fetches are strongly ordered so code must be compact and must carry out few data accesses. Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
-
由 Lorenzo Pieralisi 提交于
In order to simplify access to different affinity levels within the MPIDR_EL1 register values, this patch implements some preprocessor macros that allow to retrieve the MPIDR_EL1 affinity level value according to the level passed as input parameter. Reviewed-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
-
- 07 12月, 2013 2 次提交
-
-
由 Steve Capper 提交于
Modify the value of PMD_SECT_PROT_NONE to match that of PTE_NONE. This should have been in commit 3676f9ef (Move PTE_PROT_NONE higher up). Signed-off-by: NSteve Capper <steve.capper@linaro.org> Cc: <stable@vger.kernel.org> # 3.11+: 3676f9ef: arm64: Move PTE_PROT_NONE higher up Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Catalin Marinas 提交于
Write-combine and cacheable mappings use Normal memory on arm64. On SMP systems, the pte needs the shareability bit which is set in pgprot_default. Use this for defining PROT_DEFAULT used by ioremap_wc and ioremap_cache (Device memory is shareable by default, does not need additional attributes). Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 29 11月, 2013 2 次提交
-
-
由 Catalin Marinas 提交于
PTE_PROT_NONE means that a pte is present but does not have any read/write attributes. However, setting the memory type like pgprot_writecombine() is allowed and such bits overlap with PTE_PROT_NONE. This causes mmap/munmap issues in drivers that change the vma->vm_pg_prot on PROT_NONE mappings. This patch reverts the PTE_FILE/PTE_PROT_NONE shift in commit 59911ca4 (ARM64: mm: Move PTE_PROT_NONE bit) and moves PTE_PROT_NONE together with the other software bits. Signed-off-by: NSteve Capper <steve.capper@linaro.org> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Tested-by: NSteve Capper <steve.capper@linaro.org> Cc: <stable@vger.kernel.org> # 3.11+
-
由 Catalin Marinas 提交于
This provides better performance compared to Device GRE and also allows unaligned accesses. Such memory is intended to be used with standard RAM (e.g. framebuffers) and not I/O. Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 26 11月, 2013 1 次提交
-
-
由 Catalin Marinas 提交于
The asynchronous aborts are generally fatal for the kernel but they can be masked via the pstate A bit. If a system error happens while in kernel mode, it won't be visible until returning to user space. This patch enables this kind of abort early to help identifying the cause. Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 15 11月, 2013 1 次提交
-
-
由 Kirill A. Shutemov 提交于
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 11月, 2013 1 次提交
-
-
由 Thomas Gleixner 提交于
No point in having this bit defined by architecture. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20130917183629.090698799@linutronix.de
-
- 09 11月, 2013 2 次提交
-
-
由 Chen Gang 提交于
In current kernel wide source code, except other architectures, only s390 scsi drivers use atomic_clear_mask(), and arm/arm64 need not support s390 drivers. So remove atomic_clear_mask() from "arm[64]/include/asm/atomic.h". Signed-off-by: NChen Gang <gang.chen@asianux.com> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
-
由 Stefano Stabellini 提交于
Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: NOlof Johansson <olof@lixom.net> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 08 11月, 2013 2 次提交
-
-
由 Marc Zyngier 提交于
When booting a vcpu using PSCI, make sure we start it with the endianness of the caller. Otherwise, secondaries can be pretty unhappy to execute a BE kernel in LE mode... This conforms to PSCI spec Rev B, 5.13.3. Acked-by: NChristoffer Dall <christoffer.dall@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 Marc Zyngier 提交于
Do the necessary byteswap when host and guest have different views of the universe. Actually, the only case we need to take care of is when the guest is BE. All the other cases are naturally handled. Also be careful about endianness when the data is being memcopy-ed from/to the run buffer. Acked-by: NChristoffer Dall <christoffer.dall@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
- 06 11月, 2013 1 次提交
-
-
由 Catalin Marinas 提交于
This patch expands the VA_BITS to 42 when the 64K page configuration is enabled allowing 2TB kernel linear mapping. Linux still uses 2 levels of page tables in this configuration with pgd now being a full page. Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Acked-by: NWill Deacon <will.deacon@arm.com> Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
-
- 30 10月, 2013 2 次提交
-
-
由 Mark Salter 提交于
Some drivers (ACPI notably) use ioremap_cache() to map an area which could either be outside of kernel RAM or in an already mapped reserved area of RAM. To avoid aliases with different caching attributes, ioremap() does not allow RAM to be remapped. But for ioremap_cache(), the existing kernel mapping may be used. Signed-off-by: NMark Salter <msalter@redhat.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Marc Zyngier 提交于
On an (even slightly) oversubscribed system, spinlocks are quickly becoming a bottleneck, as some vcpus are spinning, waiting for a lock to be released, while the vcpu holding the lock may not be running at all. The solution is to trap blocking WFEs and tell KVM that we're now spinning. This ensures that other vpus will get a scheduling boost, allowing the lock to be released more quickly. Also, using CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT slightly improves the performance when the VM is severely overcommited. Acked-by: NChristoffer Dall <christoffer.dall@linaro.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
- 25 10月, 2013 7 次提交
-
-
由 Catalin Marinas 提交于
The owner and next members of the arch_spinlock_t structure need to be swapped when compiling for big endian. Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Reported-by: NMatthew Leach <matthew.leach@arm.com> Acked-by: NWill Deacon <will.deacon@arm.com>
-
由 Matthew Leach 提交于
Currently, the code for setting the __cpu_boot_mode flag is munged in with el2_setup. This makes things difficult on a BE bringup as a memory access has to have occurred before el2_setup which is the place that we'd like to set the endianess on the current EL. Create a new function for setting __cpu_boot_mode and have el2_setup return the mode the CPU. Also define a new constant in virt.h, BOOT_CPU_MODE_EL1, for readability. Acked-by: NMarc Zyngier <marc.zyngier@arm.com> Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NMatthew Leach <matthew.leach@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Matthew Leach 提交于
Add CPU_LE and CPU_BE to select assembler code in little and big endian configurations respectively. Signed-off-by: NMatthew Leach <matthew.leach@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Matthew Leach 提交于
The arm64 port contains wrappers for arm32 syscalls that pass 64-bit values. These wrappers concatenate the two registers to hold a 64-bit value in a single X register. On BE, however, the lower and higher words are swapped. Create a new assembler macro, regs_to_64, that when on BE systems swaps the registers in the orr instruction. Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NMatthew Leach <matthew.leach@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Will Deacon 提交于
This patch adds support for BE8 AArch32 tasks to the compat layer. Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Will Deacon 提交于
This patch adds support for the aarch64_be ELF format to the AArch64 ELF loader. Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Stefano Stabellini 提交于
This is similar to what it is done on X86: biovecs are prevented from merging otherwise every dma requests would be forced to bounce on the swiotlb buffer. Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Changes in v7: - remove the extra autotranslate check in biomerge.c.
-