- 14 September 2015, 1 commit
-
-
Submitted by Catalin Marinas
Commit 2f4b829c ("arm64: Add support for hardware updates of the access and dirty pte bits") introduced support for handling hardware updates of the access flag and dirty status. The PTE is automatically dirtied in hardware (if supported) by clearing the PTE_RDONLY bit when the PTE_DBM/PTE_WRITE bit is set. The pte_hw_dirty() macro was added to detect a hardware dirtied pte. The pte_dirty() macro checks for both software PTE_DIRTY and pte_hw_dirty(). Functions like pte_modify() clear the PTE_RDONLY bit since it is meant to be set in set_pte_at() when written to memory. In such cases, pte_hw_dirty() would return true even though such pte is clean. This patch changes pte_hw_dirty() to test the PTE_DBM/PTE_WRITE bit together with PTE_RDONLY. Fixes: 2f4b829c ("arm64: Add support for hardware updates of the access and dirty pte bits") Reported-by: NJulien Grall <julien.grall@citrix.com> Tested-by: NJulien Grall <julien.grall@citrix.com> Tested-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 11 September 2015, 5 commits
-
-
Submitted by Christoph Hellwig
Almost everyone implements dma_set_mask the same way, although sometimes that is hidden in a ->set_dma_mask method. This patch consolidates those into a common implementation that either calls ->set_dma_mask if present or otherwise uses the default implementation. Some architectures used to only call ->set_dma_mask after the initial checks, and those instances have been fixed to do the full work. h8300 implemented dma_set_mask bogusly as a no-op and has been fixed. Unfortunately some architectures overload unrelated semantics, such as changing the dma_ops, into it, so we still need to allow for an architecture override for now.

[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
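The consolidated helper presumably looks something like this (a sketch of the behaviour described above; the exact opt-out guard macro name is an assumption):

```c
/* Architectures with non-standard semantics can still override this. */
#ifndef HAVE_ARCH_DMA_SET_MASK
static inline int dma_set_mask(struct device *dev, u64 mask)
{
	struct dma_map_ops *ops = get_dma_ops(dev);

	/* Prefer the method if the bus/arch provides one... */
	if (ops->set_dma_mask)
		return ops->set_dma_mask(dev, mask);

	/* ...otherwise do the full default checks and assignment. */
	if (!dev->dma_mask || !dma_supported(dev, mask))
		return -EIO;
	*dev->dma_mask = mask;
	return 0;
}
#endif
```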
-
Submitted by Christoph Hellwig
Most architectures just call into ->dma_supported, but some also return 1 if the method is not present, or 0 if no dma ops are present (although that should never happen). Consolidate this broader version into common code. Also fix h8300, which incorrectly always returned 0; this would have been a problem if its dma_set_mask implementation weren't a similarly buggy no-op. As a few architectures have much more elaborate implementations, we still allow for arch overrides.

[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
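The broader version reads roughly as follows (sketch; arch overrides still take precedence):

```c
static inline int dma_supported(struct device *dev, u64 mask)
{
	struct dma_map_ops *ops = get_dma_ops(dev);

	if (!ops)
		return 0;	/* no dma ops: nothing is supported */
	if (!ops->dma_supported)
		return 1;	/* no method: assume the mask is fine */
	return ops->dma_supported(dev, mask);
}
```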
-
Submitted by Christoph Hellwig
Currently there are three valid implementations of dma_mapping_error:

(1) call ->mapping_error
(2) check for a hardcoded error code
(3) always return 0

This patch provides a common implementation that calls ->mapping_error if present, then checks for DMA_ERROR_CODE if defined, or otherwise returns 0.

[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
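All three cases fold into one helper, plausibly along these lines (sketch):

```c
static inline int dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
{
	struct dma_map_ops *ops = get_dma_ops(dev);

	debug_dma_mapping_error(dev, dma_addr);
	if (ops->mapping_error)
		return ops->mapping_error(dev, dma_addr);	/* case (1) */
#ifdef DMA_ERROR_CODE
	return dma_addr == DMA_ERROR_CODE;			/* case (2) */
#else
	return 0;						/* case (3) */
#endif
}
```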
-
Submitted by Christoph Hellwig
Most architectures do not support non-coherent allocations and either define dma_{alloc,free}_noncoherent to their coherent versions or stub them out. Openrisc uses dma_{alloc,free}_attrs to implement them, and only MIPS implements them directly. This patch moves the Openrisc version to common code and handles the DMA_ATTR_NON_CONSISTENT case in the MIPS dma_map_ops instance. Note that actual non-coherent allocations require a dma_cache_sync implementation, so if non-coherent allocations didn't work on an architecture before this patch they still won't work after it.

[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
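The Openrisc-derived generic version presumably maps onto dma_alloc_attrs() like this (a sketch using the dma_attrs API of that era):

```c
static inline void *dma_alloc_noncoherent(struct device *dev, size_t size,
					  dma_addr_t *dma_handle, gfp_t gfp)
{
	DEFINE_DMA_ATTRS(attrs);

	/* Ask the ops for a non-consistent (cacheable) buffer. */
	dma_set_attr(DMA_ATTR_NON_CONSISTENT, &attrs);
	return dma_alloc_attrs(dev, size, dma_handle, gfp, &attrs);
}
```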
-
Submitted by Christoph Hellwig
Since 2009 we have a nice asm-generic header implementing lots of DMA API functions for architectures using struct dma_map_ops, but unfortunately it's still missing a lot of APIs that all architectures still have to duplicate. This series consolidates the remaining functions, although we still need arch opt-outs for two of them, as a few architectures have very non-standard implementations.

This patch (of 5): The coherent DMA allocator works the same over all architectures supporting dma_map operations. This patch consolidates them and converges the minor differences (see the sketch after this list):

- the debug_dma helpers are now called from all architectures, including those that were previously missing them
- dma_alloc_from_coherent and dma_release_from_coherent are now always called from the generic alloc/free routines instead of the ops; dma-mapping-common.h always includes dma-coherent.h to get the definitions for them, or the stubs if the architecture doesn't support this feature
- checks for ->alloc / ->free presence are removed; there is only one instance of dma_map_ops without them (mic_dma_ops), and that one is x86-only anyway

Besides that, only x86 needs special treatment, to substitute a default device if none is passed and to tweak the gfp_flags. An optional arch hook is provided for that.

[linux@roeck-us.net: fix build]
[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
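Putting the bullet points together, the generic allocator plausibly ends up shaped like this (sketch; arch_dma_alloc_attrs is the optional x86 hook mentioned above):

```c
static inline void *dma_alloc_attrs(struct device *dev, size_t size,
				    dma_addr_t *dma_handle, gfp_t flag,
				    struct dma_attrs *attrs)
{
	struct dma_map_ops *ops = get_dma_ops(dev);
	void *cpu_addr;

	/* Per-device coherent pools are now always consulted first. */
	if (dma_alloc_from_coherent(dev, size, dma_handle, &cpu_addr))
		return cpu_addr;

	if (!arch_dma_alloc_attrs(&dev, &flag))	/* optional arch hook */
		return NULL;
	if (!ops->alloc)			/* covers mic_dma_ops */
		return NULL;

	cpu_addr = ops->alloc(dev, size, dma_handle, flag, attrs);
	debug_dma_alloc_coherent(dev, size, *dma_handle, cpu_addr);
	return cpu_addr;
}
```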
-
- 28 August 2015, 1 commit
-
-
Submitted by Christoph Hellwig
Three architectures already define these, and we'll need them generically soon.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 24 August 2015, 2 commits
-
-
Submitted by Ard Biesheuvel
The linear region size of a 39-bit VA kernel is only 256 GB, which may be insufficient to cover all of system RAM, even on platforms that have much less than 256 GB of memory but whose memory is laid out very sparsely. So make sure we clip the memory we will not be able to map before installing it into the memblock memory table, by setting MAX_MEMBLOCK_ADDR accordingly.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
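In the FDT memory scan, this presumably amounts to clipping each bank before it is handed to memblock (a sketch of the check, not the exact upstream hunk):

```c
#ifndef MAX_MEMBLOCK_ADDR
#define MAX_MEMBLOCK_ADDR	((phys_addr_t)~0)	/* default: no clipping */
#endif

void __init early_init_dt_add_memory_arch(u64 base, u64 size)
{
	if (base > MAX_MEMBLOCK_ADDR)
		return;					/* bank entirely unmappable */
	if (base + size - 1 > MAX_MEMBLOCK_ADDR)
		size = MAX_MEMBLOCK_ADDR - base + 1;	/* clip the tail */
	memblock_add(base, size);
}
```

arm64 would then define MAX_MEMBLOCK_ADDR to the highest linearly-mappable physical address.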
-
Submitted by Alexander Kuleshov
Architecture-specific code for i386 and x86_64 was unified and merged into arch/x86. This patch fixes the old x86 path in a comment in arch/arm64/include/asm/fixmap.h.

Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
- 20 August 2015, 2 commits
-
-
Submitted by Julien Grall
Currently, the event channel rebind code is gated on the presence of the vector callback. The virtual interrupt controller on ARM has the concept of a per-CPU interrupt (PPI), which allows us to support per-VCPU event channels. Therefore there is no need for a vector callback on ARM: Xen is already using a free PPI to notify the guest VCPU of an event. Furthermore, the Xen initialization code in Linux (see arch/arm/xen/enlighten.c) already correctly requests a per-CPU IRQ. Introduce a new helper, xen_support_evtchn_rebind, to let each architecture decide whether rebinding an event channel is supported. It always returns true on ARM and keeps the same behavior on x86. This also allows us to drop the usage of xen_have_vector_callback entirely in the ARM code.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
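A sketch of the per-architecture helper, with the bodies inferred from the description above (each would live in that architecture's xen events header):

```c
/* x86: rebinding still depends on the vector callback in HVM guests. */
static inline bool xen_support_evtchn_rebind(void)
{
	return (!xen_hvm_domain() || xen_have_vector_callback);
}

/* arm/arm64: events arrive via a per-CPU PPI, so rebinding always works. */
static inline bool xen_support_evtchn_rebind(void)
{
	return true;
}
```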
-
Submitted by Mario Smarduch
This patch saves and restores FP/SIMD registers only on guest access. To do this, the cptr_el2 FP/SIMD trap is set on guest entry and later checked on exit. lmbench and hackbench show significant improvements: for 30-50% of exits the FP/SIMD context is not saved/restored.

[chazy/maz: fixed save/restore logic for 32bit guests]
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
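In pseudo-C, the lazy switch works roughly like this (a hypothetical sketch: the helper names are invented, and the real logic lives in the hyp assembly):

```c
/* Guest entry: trap FP/SIMD so the first guest access faults into hyp. */
u64 cptr = read_cptr_el2();			/* hypothetical accessor */
write_cptr_el2(cptr | CPTR_EL2_TFP);

/*
 * If the guest touches FP/SIMD, the trap handler saves the host state,
 * restores the guest state and clears CPTR_EL2_TFP before resuming.
 */

/* Guest exit: pay for a save/restore only if the trap was actually taken. */
if (!(read_cptr_el2() & CPTR_EL2_TFP)) {
	save_fpsimd_state(&guest_fpsimd);	/* hypothetical */
	restore_fpsimd_state(&host_fpsimd);	/* hypothetical */
}
```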
-
- 13 August 2015, 1 commit
-
-
Submitted by Jungseok Lee
gic_handle_irq() is defined with the __exception_irq_entry attribute. The single remaining piece of work is to add its definition, as ARM did. Below is how the function graph data changes with these hunks.

A prologue of an interrupt handler is drawn as follows.

- current status

    0)   0.208 us    |  cpuidle_not_available();
    0)               |  default_idle_call() {
    0)               |    arch_cpu_idle() {
    0)               |      __handle_domain_irq() {
    0)               |        irq_enter() {
    0)   0.313 us    |          rcu_irq_enter();
    0)   0.261 us    |          __local_bh_disable_ip();

- with this change

    0)   0.625 us    |  cpuidle_not_available();
    0)               |  default_idle_call() {
    0)               |    arch_cpu_idle() {
    0)  ==========>  |
    0)               |  gic_handle_irq() {
    0)               |    __handle_domain_irq() {
    0)               |      irq_enter() {
    0)   0.885 us    |        rcu_irq_enter();
    0)   0.781 us    |        __local_bh_disable_ip();

An epilogue of an interrupt handler is recorded as follows.

- current status

    0)   0.261 us    |  idle_cpu();
    0)               |  rcu_irq_exit() {
    0)   0.521 us    |    rcu_eqs_enter_common.isra.46();
    0)   2.552 us    |  }
    0) ! 322.448 us  |  }
    0) ! 583.437 us  |  }
    0) # 1656.041 us |  }
    0) # 1658.073 us |  }

- with this change

    0)   0.677 us    |  idle_cpu();
    0)               |  rcu_irq_exit() {
    0)   1.770 us    |    rcu_eqs_enter_common.isra.46();
    0)   7.968 us    |  }
    0) # 1803.541 us |  }
    0) # 2626.667 us |  }
    0) # 2632.969 us |  }
    0)  <==========  |
    0) # 14425.00 us |  }
    0) # 14430.98 us |  }

Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Rabin Vincent <rabin@rab.in>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Jungseok Lee <jungseoklee85@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
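The change itself is small; on arm64 it presumably amounts to defining the annotation the way 32-bit ARM does (sketch):

```c
/*
 * arch/arm64: place __exception_irq_entry handlers in .irqentry.text so the
 * function-graph tracer can draw the "==========>" interrupt markers above.
 */
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
#define __exception_irq_entry	__irq_entry
#else
#define __exception_irq_entry	__exception
#endif

/* drivers/irqchip: the handler is already annotated */
static void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
{
	/* ... acknowledge and dispatch pending IRQs ... */
}
```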
-
- 12 August 2015, 2 commits
-
-
Submitted by Vladimir Murzin
Since commit 8a14849b ("arm64: KVM: Switch vgic save/restore to alternative_insn") vgic_sr_vectors is not used anymore, so remove the remaining leftovers and kill the structure.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Submitted by Suzuki K. Poulose
This patch adds a generic ARMv8 KVM target CPU type for use by new CPUs, which eventually end up using the common sys_reg table. For backward compatibility the existing targets have been preserved. Any new target CPU that can be covered by the generic v8 sys_reg tables should make use of the new generic target.

Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Acked-by: Marc Zyngier <Marc.Zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
- 05 August 2015, 1 commit
-
-
Submitted by Mark Rutland
The ll/sc __cmpxchg_case_##name assembly mostly uses symbolic names for operands, but in a single case uses %2 to refer to what is otherwise known as %[v]. This makes the code more painful to read than is necessary. Use %[v] instead.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
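For illustration, here is a toy exchange loop in the symbolic-name style (not the actual __cmpxchg_case body):

```c
static inline unsigned long toy_xchg(unsigned long *ptr, unsigned long newval)
{
	unsigned long old, tmp;

	asm volatile(
	/* every operand is referenced by its symbolic name, never %2 */
	"1:	ldxr	%[old], %[v]\n"
	"	stxr	%w[tmp], %[new], %[v]\n"
	"	cbnz	%w[tmp], 1b\n"
	: [old] "=&r" (old), [tmp] "=&r" (tmp), [v] "+Q" (*ptr)
	: [new] "r" (newval)
	: "memory");

	return old;
}
```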
-
- 03 August 2015, 3 commits
-
-
Submitted by Mark Rutland
To enable sharing with arm, move the core PSCI framework code to drivers/firmware. This results in a minor gain in lines of code, but this will quickly be amortised by the removal of code currently duplicated in arch/arm.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Submitted by Peter Zijlstra
There are various problems and short-comings with the current static_key interface:

- static_key_{true,false}() read like a branch depending on the key value, instead of the actual likely/unlikely branch depending on the init value.
- static_key_{true,false}() are, as stated above, tied to the static_key init values STATIC_KEY_INIT_{TRUE,FALSE}.
- we're limited to the 2 (out of 4) possible options that compile to a default NOP, because that's what our arch_static_branch() assembly emits.

So provide a new static_key interface:

    DEFINE_STATIC_KEY_TRUE(name);
    DEFINE_STATIC_KEY_FALSE(name);

which define a key of different types with an initial true/false value. Then allow:

    static_branch_likely()
    static_branch_unlikely()

to take a key of either type and emit the right instruction for the case. This means adding a second arch_static_branch_jump() assembly helper which emits a JMP by default. In order to determine the right instruction for the right state, encode the branch type in the LSB of jump_entry::key. This is the final step in removing the naming confusion that has led to a stream of avoidable bugs such as:

    a833581e ("x86, perf: Fix static_key bug in load_mm_cr4()")

...but it also allows new static key combinations that will give us performance enhancements in the subsequent patches.

Tested-by: Rabin Vincent <rabin@rab.in> # arm
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> # ppc
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
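Typical usage of the new interface would look like this (sketch; fast_path()/slow_path() are placeholders):

```c
DEFINE_STATIC_KEY_FALSE(use_fast_path);	/* branch compiles to a NOP by default */

void do_work(void)
{
	if (static_branch_unlikely(&use_fast_path))
		fast_path();	/* only reachable once the key is flipped */
	else
		slow_path();
}

/* later, once setup decides the fast path is safe: */
static_branch_enable(&use_fast_path);
```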
-
Submitted by Andrey Konovalov
Replace the ACCESS_ONCE() macro in smp_store_release() and smp_load_acquire() with WRITE_ONCE() and READ_ONCE() on x86, arm, arm64, ia64, metag, mips, powerpc, s390, sparc and asm-generic, since ACCESS_ONCE() does not work reliably on non-scalar types. WRITE_ONCE() and READ_ONCE() were introduced in the following commits:

    230fa253 ("kernel: Provide READ_ONCE and ASSIGN_ONCE")
    43239cbe ("kernel: Change ASSIGN_ONCE(val, x) to WRITE_ONCE(x, val)")

Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Alexander Duyck <alexander.h.duyck@redhat.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arch@vger.kernel.org
Link: http://lkml.kernel.org/r/1438528264-714-1-git-send-email-andreyknvl@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
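The asm-generic fallbacks after this change read roughly as follows (sketch):

```c
#define smp_store_release(p, v)						\
do {									\
	compiletime_assert_atomic_type(*p);				\
	smp_mb();							\
	WRITE_ONCE(*p, v);	/* was: ACCESS_ONCE(*p) = (v) */	\
} while (0)

#define smp_load_acquire(p)						\
({									\
	typeof(*p) ___p1 = READ_ONCE(*p); /* was: ACCESS_ONCE(*p) */	\
	compiletime_assert_atomic_type(*p);				\
	smp_mb();							\
	___p1;								\
})
```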
-
- 31 July 2015, 2 commits
-
-
Submitted by Will Deacon
When performing a cmpxchg operation on a signed sub-word type (e.g. s8), we need to ensure that the upper register bits of the "old" value used for comparison are zeroed, otherwise we may erroneously fail the cmpxchg, which may even be interpreted as success by the caller (if the compiler performs the truncation as part of its check). This has been observed in mod_state, where negative values were causing problems with this_cpu_cmpxchg. This patch fixes the issue by explicitly casting 8-bit and 16-bit "old" values using unsigned types in our cmpxchg wrappers. 32-bit types can be left alone, since the underlying asm makes use of W registers in this case.

Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
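A sketch of the fix in the size-dispatch wrapper (without the cast, an s8 "old" of -1 sign-extends to 0xff...ff while the loaded byte zero-extends to 0xff, so the comparison never matches):

```c
static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
				      unsigned long new, int size)
{
	switch (size) {
	case 1:
		return __cmpxchg_case_1(ptr, (u8)old, new);	/* zero-extend */
	case 2:
		return __cmpxchg_case_2(ptr, (u16)old, new);	/* zero-extend */
	case 4:
		return __cmpxchg_case_4(ptr, old, new);	/* W regs: already fine */
	case 8:
		return __cmpxchg_case_8(ptr, old, new);
	}
	unreachable();
}
```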
-
Submitted by Will Deacon
When patching the kernel text with alternatives, we may end up patching parts of the stop_machine state machine (e.g. atomic_dec_and_test in ack_state) and consequently corrupt the instruction stream of any secondary CPUs. This patch passes the cpu_online_mask to stop_machine, forcing all of the CPUs into our own callback which can place the secondary cores into a dumb (but safe!) polling loop whilst the patching is carried out.

Signed-off-by: Will Deacon <will.deacon@arm.com>
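The shape of the fix (a sketch; the real callback's synchronisation details may differ):

```c
static int __apply_alternatives_multi_stop(void *unused)
{
	static int patched;

	if (smp_processor_id()) {
		/* Secondary CPUs: spin in a dumb (but safe!) loop while the
		 * boot CPU rewrites kernel text under everyone's feet. */
		while (!READ_ONCE(patched))
			cpu_relax();
		isb();			/* resync the instruction stream */
	} else {
		__apply_alternatives_all();	/* hypothetical worker name */
		WRITE_ONCE(patched, 1);
	}
	return 0;
}

void __init apply_alternatives_all(void)
{
	/* herd every online CPU into the callback above */
	stop_machine(__apply_alternatives_multi_stop, NULL, cpu_online_mask);
}
```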
-
- 30 July 2015, 2 commits
-
-
Submitted by Will Deacon
For some reason, the ll/sc cmpxchg asm is all off to the left and awkward to read in conjunction with the following (correctly indented) LSE version. This patch shifts the ll/sc code back to where it should be.

Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Submitted by Jonas Rabenstein
Commit 4b3dc967 ("arm64: force CONFIG_SMP=y and remove redundant and therfore can not be selected anymore. Remove dead #ifdef-block depending on UP_LATE_INIT in arch/arm64/kernel/setup.c Signed-off-by: NJonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> [will: kill do_post_cpus_up_work altogether] Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 28 July 2015, 5 commits
-
-
Submitted by Will Deacon
pte_valid should check if the PTE_VALID bit (1 << 0) is set in the pte, so fix the macro definition to use bitwise & instead of logical &&.

Signed-off-by: Will Deacon <will.deacon@arm.com>
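In other words:

```c
/* before (wrong): logical AND is true for ANY non-zero pte value */
/*	#define pte_valid(pte)	(pte_val(pte) && PTE_VALID) */

/* after: actually tests the PTE_VALID bit (1 << 0) */
#define pte_valid(pte)		(pte_val(pte) & PTE_VALID)
```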
-
Submitted by Will Deacon
When unlocking a spinlock, we perform a read-modify-write on the owner ticket in order to increment it and store it back with release semantics. In the LL/SC case, we load the 16-bit ticket using a 32-bit load and therefore store back the wrong halfword on a big-endian system, corrupting the lock after the first unlock and killing the system dead. This patch fixes the unlock code to use 16-bit accessors consistently.

Signed-off-by: Will Deacon <will.deacon@arm.com>
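A sketch of the fixed unlock path, using halfword accessors throughout:

```c
static inline void arch_spin_unlock(arch_spinlock_t *lock)
{
	unsigned long tmp;

	asm volatile(
	"	ldrh	%w0, %1\n"	/* 16-bit load of the owner ticket */
	"	add	%w0, %w0, #1\n"
	"	stlrh	%w0, %1\n"	/* 16-bit store-release */
	: "=&r" (tmp), "+Q" (lock->owner)
	:
	: "memory");
}
```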
-
Submitted by Catalin Marinas
The flush_tlb_page() function is used on user address ranges when PTEs (or PMDs/PUDs for huge pages) were changed (attributes or clearing). For such cases, it is more efficient to invalidate only the last level of the TLB with the "tlbi vale1is" instruction. In the TLB shoot-down case, the TLB caching of the intermediate page table levels (pmd, pud, pgd) is handled by __flush_tlb_pgtable() via the __(pte|pmd|pud)_free_tlb() functions and it is not deferred to tlb_finish_mmu() (as of commit 285994a6 - "arm64: Invalidate the TLB corresponding to intermediate page table levels"). The tlb_flush() function only needs to invalidate the TLB for the last level of page tables; the __flush_tlb_range() function gains a fourth argument for last level TLBI.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
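Schematically, the extended range flush might look like this (a sketch of the described behaviour, not the exact upstream hunk):

```c
static inline void __flush_tlb_range(struct vm_area_struct *vma,
				     unsigned long start, unsigned long end,
				     bool last_level)
{
	unsigned long asid = ASID(vma->vm_mm) << 48;
	unsigned long addr;

	dsb(ishst);
	for (addr = start >> 12; addr < end >> 12; addr++) {
		if (last_level)
			/* leaf entries only: cached pmd/pud/pgd walk
			 * entries may legitimately survive */
			asm("tlbi vale1is, %0" : : "r" (asid | addr));
		else
			asm("tlbi vae1is, %0" : : "r" (asid | addr));
	}
	dsb(ish);
}
```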
-
Submitted by Catalin Marinas
This patch moves the MAX_TLB_RANGE check into the flush_tlb(_kernel)_range functions directly, to avoid the underscore-prefixed definitions (and for consistency with a subsequent patch).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
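After both of these patches, the user-visible entry point plausibly reads (sketch):

```c
static inline void flush_tlb_range(struct vm_area_struct *vma,
				   unsigned long start, unsigned long end)
{
	if ((end - start) > MAX_TLB_RANGE)
		/* per-page invalidation would cost more than nuking all */
		flush_tlb_mm(vma->vm_mm);
	else
		__flush_tlb_range(vma, start, end, false);
}
```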
-
Submitted by Will Deacon
lib/list_sort.c defines a 'struct debug_el', where "el" is presumably a contraction of "element". This conflicts with 'enum debug_el' in our asm/debug-monitors.h header file, where "el" stands for Exception Level. The result is a build failure when targeting allmodconfig, so rename our enum to 'dbg_active_el' to be slightly more explicit about what it is.

Signed-off-by: Will Deacon <will.deacon@arm.com>
-
- 27 July 2015, 13 commits
-
-
Submitted by Will Deacon
Other CPU features follow an 'ARM64_HAS_*' naming scheme, so do the same for the LSE atomics.

Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Submitted by Will Deacon
If we attempt to atomic64_dec_if_positive on INT_MIN, we will underflow and incorrectly decide that the original parameter was positive. This patch fixes the broken condition code so that we handle this corner case correctly.

Reviewed-by: Steve Capper <steve.capper@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Submitted by Will Deacon
We don't need duplicate cmpxchg implementations, so use cmpxchg to implement atomic{,64}_cmpxchg, like we do for xchg already.

Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
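That is, the atomic variants become thin wrappers (sketch of the obvious reduction):

```c
#define atomic_cmpxchg(v, old, new)	cmpxchg(&((v)->counter), (old), (new))
#define atomic64_cmpxchg(v, old, new)	cmpxchg(&((v)->counter), (old), (new))
```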
-
Submitted by Will Deacon
The cost of changing a cacheline from shared to exclusive state can be significant, especially when this is triggered by an exclusive store, since it may result in having to retry the transaction. This patch makes use of prfm to prefetch cachelines for write prior to ldxr/stxr loops when using the ll/sc atomic routines.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
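For example, the ll/sc atomic_add loop then looks roughly like this (sketch):

```c
static inline void atomic_add(int i, atomic_t *v)
{
	unsigned long tmp;
	int result;

	asm volatile(
	"	prfm	pstl1strm, %2\n"	/* prefetch the line for store */
	"1:	ldxr	%w0, %2\n"
	"	add	%w0, %w0, %w3\n"
	"	stxr	%w1, %w0, %2\n"
	"	cbnz	%w1, 1b"		/* exclusive store failed: retry */
	: "=&r" (result), "=&r" (tmp), "+Q" (v->counter)
	: "Ir" (i));
}
```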
-
Submitted by Will Deacon
The common (i.e. identical for ll/sc and lse) atomic macros in atomic.h are needlessly different for atomic_t and atomic64_t. This patch tidies up the definitions to make them consistent across the two atomic types and factors out common code, such as the add_unless implementation based on cmpxchg.

Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
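The cmpxchg-based add_unless factored out here is presumably the classic loop (sketch):

```c
static inline int __atomic_add_unless(atomic_t *v, int a, int u)
{
	int c, old;

	c = atomic_read(v);
	while (c != u && (old = atomic_cmpxchg(v, c, c + a)) != c)
		c = old;	/* lost a race: retry with the fresh value */
	return c;		/* callers compare the result against u */
}
```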
-
Submitted by Will Deacon
cmpxchg doesn't require memory barrier semantics when the value comparison fails, so make the barrier conditional on success.

Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Submitted by Will Deacon
We can perform the cmpxchg comparison using eor and cbnz, which avoids the "cc" clobber for the ll/sc case and consequently for the LSE case, where we may have to fall back on the ll/sc code at runtime.

Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
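Combined with the conditional barrier from the previous patch, the ll/sc core plausibly becomes (a sketch for the 32-bit case):

```c
static inline u32 toy_cmpxchg32(volatile u32 *ptr, u32 old, u32 new)
{
	unsigned long res;
	u32 oldval;

	asm volatile(
	"1:	ldxr	%w1, %2\n"
	"	eor	%w0, %w1, %w3\n"	/* compare without touching NZCV */
	"	cbnz	%w0, 2f\n"		/* mismatch: skip store and barrier */
	"	stxr	%w0, %w4, %2\n"
	"	cbnz	%w0, 1b\n"		/* exclusive store failed: retry */
	"	dmb	ish\n"			/* full barrier only on success */
	"2:"
	: "=&r" (res), "=&r" (oldval), "+Q" (*ptr)
	: "r" (old), "r" (new)
	: "memory");

	return oldval;
}
```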
-
Submitted by Will Deacon
On CPUs which support the LSE atomic instructions introduced in ARMv8.1, it makes sense to use them in preference to ll/sc sequences. This patch introduces runtime patching of our cmpxchg_double primitives so that the LSE casp instruction is used instead.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Submitted by Will Deacon
On CPUs which support the LSE atomic instructions introduced in ARMv8.1, it makes sense to use them in preference to ll/sc sequences. This patch introduces runtime patching of our cmpxchg primitives so that the LSE cas instruction is used instead.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Submitted by Will Deacon
On CPUs which support the LSE atomic instructions introduced in ARMv8.1, it makes sense to use them in preference to ll/sc sequences. This patch introduces runtime patching of our xchg primitives so that the LSE swp instruction (yes, you read right!) is used instead.

Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
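Schematically, each call site carries both sequences and is patched at boot (a sketch built on the ARM64_LSE_ATOMIC_INSN alternative; both sides must be the same length, hence the NOP padding):

```c
static inline unsigned long toy_xchg64(unsigned long x, unsigned long *ptr)
{
	unsigned long ret, tmp;

	asm volatile(ARM64_LSE_ATOMIC_INSN(
	/* LL/SC fallback, used unless the CPU advertises LSE */
	"1:	ldxr	%0, %2\n"
	"	stlxr	%w1, %3, %2\n"
	"	cbnz	%w1, 1b\n"
	"	dmb	ish",
	/* LSE: one instruction with acquire+release semantics */
	"	swpal	%3, %0, %2\n"
	"	nop\n"
	"	nop\n"
	"	nop")
	: "=&r" (ret), "=&r" (tmp), "+Q" (*ptr)
	: "r" (x)
	: "memory");

	return ret;
}
```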
-
Submitted by Will Deacon
On CPUs which support the LSE atomic instructions introduced in ARMv8.1, it makes sense to use them in preference to ll/sc sequences. This patch introduces runtime patching of our bitops functions so that LSE atomic instructions are used instead.

Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Submitted by Will Deacon
On CPUs which support the LSE atomic instructions introduced in ARMv8.1, it makes sense to use them in preference to ll/sc sequences. This patch introduces runtime patching of our locking functions so that LSE atomic instructions are used for spinlocks and rwlocks.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Submitted by Will Deacon
On CPUs which support the LSE atomic instructions introduced in ARMv8.1, it makes sense to use them in preference to ll/sc sequences. This patch introduces runtime patching of atomic_t and atomic64_t routines so that the call-site for the out-of-line ll/sc sequences is patched with an LSE atomic instruction when we detect that the CPU supports it. If binutils is not recent enough to assemble the LSE instructions, then the ll/sc sequences are inlined as though CONFIG_ARM64_LSE_ATOMICS=n.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-