提交 · 005f78cd88494457ed38ce817f4e3fe5d372f0cb · openanolis / cloud-kernel

08 5月, 2014 2 次提交

arm64: defer reloading a task's FPSIMD state to userland resume · 005f78cd

由 Ard Biesheuvel 提交于 5月 08, 2014

If a task gets scheduled out and back in again and nothing has touched
its FPSIMD state in the mean time, there is really no reason to reload
it from memory. Similarly, repeated calls to kernel_neon_begin() and
kernel_neon_end() will preserve and restore the FPSIMD state every time.

This patch defers the FPSIMD state restore to the last possible moment,
i.e., right before the task returns to userland. If a task does not return to
userland at all (for any reason), the existing FPSIMD state is preserved
and may be reused by the owning task if it gets scheduled in again on the
same CPU.

This patch adds two more functions to abstract away from straight FPSIMD
register file saves and restores:
- fpsimd_restore_current_state -> ensure current's FPSIMD state is loaded
- fpsimd_flush_task_state -> invalidate live copies of a task's FPSIMD state
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>

005f78cd

arm64: add abstractions for FPSIMD state manipulation · c51f9269

由 Ard Biesheuvel 提交于 2月 24, 2014

There are two tacit assumptions in the FPSIMD handling code that will no longer
hold after the next patch that optimizes away some FPSIMD state restores:
. the FPSIMD registers of this CPU contain the userland FPSIMD state of
  task 'current';
. when switching to a task, its FPSIMD state will always be restored from
  memory.

This patch adds the following functions to abstract away from straight FPSIMD
register file saves and restores:
- fpsimd_preserve_current_state -> ensure current's FPSIMD state is saved
- fpsimd_update_current_state -> replace current's FPSIMD state

Where necessary, the signal handling and fork code are updated to use the above
wrappers instead of poking into the FPSIMD registers directly.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>

c51f9269

08 4月, 2014 2 次提交

arm64: add early_ioremap support · bf4b558e

由 Mark Salter 提交于 4月 07, 2014

Add support for early IO or memory mappings which are needed before the
normal ioremap() is usable.  This also adds fixmap support for permanent
fixed mappings such as that used by the earlyprintk device register
region.
Signed-off-by: NMark Salter <msalter@redhat.com>
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

bf4b558e

arm64: initialize pgprot info earlier in boot · 0bf757c7

由 Mark Salter 提交于 4月 07, 2014

Presently, paging_init() calls init_mem_pgprot() to initialize pgprot
values used by macros such as PAGE_KERNEL, PAGE_KERNEL_EXEC, etc.

The new fixmap and early_ioremap support also needs to use these macros
before paging_init() is called.  This patch moves the init_mem_pgprot()
call out of paging_init() and into setup_arch() so that pgprot_default
gets initialized in time for fixmap and early_ioremap.
Signed-off-by: NMark Salter <msalter@redhat.com>
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0bf757c7

05 4月, 2014 1 次提交

Revert "arm64: virt: ensure visibility of __boot_cpu_mode" · 0a997ecc

由 Catalin Marinas 提交于 3月 28, 2014

This reverts commit 82b2f495. The
__boot_cpu_mode variable is flushed in head.S after being written,
therefore the additional cache flushing is no longer required.
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

0a997ecc

03 4月, 2014 1 次提交

arm64: Update the TCR_EL1 translation granule definitions for 16K pages · 35a86976

由 Catalin Marinas 提交于 4月 02, 2014

The current TCR register setting in arch/arm64/mm/proc.S assumes that
TCR_EL1.TG* fields are one bit wide and bit 31 is RES1 (reserved, set to
1). With the addition of 16K pages (currently unsupported in the
kernel), the TCR_EL1.TG* fields have been extended to two bits. This
patch updates the corresponding Linux definitions and drops the bit 31
setting in proc.S in favour of the new macros.
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
Reported-by: NJoe Sylve <joe.sylve@gmail.com>

35a86976

24 3月, 2014 1 次提交

arm64: Remove pgprot_dmacoherent() · 196adf2f

由 Catalin Marinas 提交于 3月 24, 2014

Since this macro is identical to pgprot_writecombine() and is only used
in a single place, remove it completely to avoid confusion. On ARMv7+
processors, the coherent DMA mapping must be Normal NonCacheable (a.k.a.
writecombine) to avoid mismatched hardware attribute aliases (with the
kernel linear mapping as Normal Cacheable).
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

196adf2f

21 3月, 2014 1 次提交

arm64: Fix __range_ok macro · 31b1e940

由 Christopher Covington 提交于 3月 19, 2014

Without this, the following scenario is incorrectly determined
to be invalid.

addr 0x7f_ffffe000 size 8192 addr_limit 0x80_00000000

This behavior was observed while trying to vmsplice the stack
as part of a CRIU dump of a process on a system started with the
norandmaps kernel parameter.
Signed-off-by: NChristopher Covington <cov@codeaurora.org>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

31b1e940

15 3月, 2014 3 次提交

arm64: mm: Route pmd thp functions through pte equivalents · 9c7e535f

由 Steve Capper 提交于 2月 25, 2014

Rather than have separate hugetlb and transparent huge page pmd
manipulation functions, re-wire our thp functions to simply call the
pte equivalents.

This allows THP to take advantage of the new PTE_WRITE logic introduced
in:
  c2c93e5b arm64: mm: Introduce PTE_WRITE

To represent splitting THPs we use the PTE_SPECIAL bit as this is not
used for pmds.
Signed-off-by: NSteve Capper <steve.capper@linaro.org>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

9c7e535f

arm64: rwsem: use asm-generic rwsem implementation · c209f799

由 Will Deacon 提交于 3月 14, 2014

asm-generic offers an atomic-add based rwsem implementation, which
can avoid the need for heavier, spinlock-based synchronisation on the
fast path.

This patch makes use of the optimised implementation for arm64 CPUs.
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

c209f799

arm64: enable generic CPU feature modalias matching for this architecture · 3be1a5c4

由 Ard Biesheuvel 提交于 3月 04, 2014

This enables support for the generic CPU feature modalias implementation that
wires up optional CPU features to udev based module autoprobing.

A file <asm/cpufeature.h> is provided that maps CPU feature numbers to
elf_hwcap bits, which is the standard way on arm64 to advertise optional CPU
features both internally and to user space.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
[catalin.marinas@arm.com: removed unnecessary "!!"]
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

3be1a5c4

13 3月, 2014 5 次提交

ARM64: perf: support dwarf unwinding in compat mode · 5f888a1d

由 Jean Pihet 提交于 2月 03, 2014

Add support for unwinding using the dwarf information in compat
mode. Using the correct user stack pointer allows perf to record
the frames correctly in the native and compat modes.

Note that although the dwarf frame unwinding works ok using
libunwind in native mode (on ARMv7 & ARMv8), some changes are
required to the libunwind code for the compat mode. Those changes
are posted separately on the libunwind mailing list.

Tested on ARMv8 platform with v8 and compat v7 binaries, the latter
are statically built.
Signed-off-by: NJean Pihet <jean.pihet@linaro.org>
Acked-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

5f888a1d

ARM64: perf: add support for perf registers API · 2ee0d7fd

由 Jean Pihet 提交于 2月 03, 2014

This patch implements the functions required for the perf registers API,
allowing the perf tool to interface kernel register dumps with libunwind
in order to provide userspace backtracing.
Compat mode is also supported.

Only the general purpose user space registers are exported, i.e.:
 PERF_REG_ARM_X0,
 ...
 PERF_REG_ARM_X28,
 PERF_REG_ARM_FP,
 PERF_REG_ARM_LR,
 PERF_REG_ARM_SP,
 PERF_REG_ARM_PC
and not the PERF_REG_ARM_V* registers.
Signed-off-by: NJean Pihet <jean.pihet@linaro.org>
Acked-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

2ee0d7fd

arm64: Add boot time configuration of Intermediate Physical Address size · 87366d8c

由 Radha Mohan Chintakuntla 提交于 3月 07, 2014

ARMv8 supports a range of physical address bit sizes. The PARange bits
from ID_AA64MMFR0_EL1 register are read during boot-time and the
intermediate physical address size bits are written in the translation
control registers (TCR_EL1 and VTCR_EL2).

There is no change in the VA bits and levels of translation.
Signed-off-by: NRadha Mohan Chintakuntla <rchintakuntla@cavium.com>
Reviewed-by: NWill Deacon <Will.deacon@arm.com>
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

87366d8c

arm64: Do not synchronise I and D caches for special ptes · 71fdb6bf

由 Catalin Marinas 提交于 3月 12, 2014

Special pte mappings are not intended to be executable and do not even
have an associated struct page. This patch ensures that we do not call
__sync_icache_dcache() on such ptes.
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
Reported-by: NSteve Capper <Steve.Capper@arm.com>
Tested-by: NLaura Abbott <lauraa@codeaurora.org>
Tested-by: NBharat Bhushan <Bharat.Bhushan@freescale.com>
Cc: <stable@vger.kernel.org>

71fdb6bf

arm64: Make DMA coherent and strongly ordered mappings not executable · de2db743

由 Catalin Marinas 提交于 3月 12, 2014

pgprot_{dmacoherent,writecombine,noncached} don't need to generate
executable mappings with side-effects like __sync_icache_dcache() being
called when the mapping is in user space.
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
Reported-by: NBharat Bhushan <Bharat.Bhushan@freescale.com>
Tested-by: NLaura Abbott <lauraa@codeaurora.org>
Tested-by: NBharat Bhushan <Bharat.Bhushan@freescale.com>
Cc: <stable@vger.kernel.org>

de2db743

10 3月, 2014 1 次提交

arm64: barriers: add dmb barrier · d152d22a

由 Will Deacon 提交于 3月 10, 2014

Commit 8adbf57f ("irqchip: gic: use dmb ishst instead of dsb when
raising a softirq") added an explicit dmb(...) call to the GIC driver.

This patch adds a simple dmb() macro to arm64, which expands to a DMB SY
instruction.
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

d152d22a

04 3月, 2014 4 次提交

arm64: topology: Implement basic CPU topology support · f6e763b9

由 Mark Brown 提交于 3月 04, 2014

Add basic CPU topology support to arm64, based on the existing pre-v8
code and some work done by Mark Hambleton.  This patch does not
implement any topology discovery support since that should be based on
information from firmware, it merely implements the scaffolding for
integration of topology support in the architecture.

No locking of the topology data is done since it is only modified during
CPU bringup with external serialisation from the SMP code.

The goal is to separate the architecture hookup for providing topology
information from the DT parsing in order to ease review and avoid
blocking the architecture code (which will be built on by other work)
with the DT code review by providing something simple and basic.

Following patches will implement support for interpreting topology
information from MPIDR and for parsing the DT topology bindings for ARM,
similar patches will be needed for ACPI.
Signed-off-by: NMark Brown <broonie@linaro.org>
Acked-by: NMark Rutland <mark.rutland@arm.com>
[catalin.marinas@arm.com: removed CONFIG_CPU_TOPOLOGY, always on if SMP]
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

f6e763b9

arm64: advertise ARMv8 extensions to 32-bit compat ELF binaries · 4cf761cd

由 Ard Biesheuvel 提交于 3月 03, 2014

This adds support for advertising the presence of ARMv8 Crypto
Extensions in the Aarch32 execution state to 32-bit ELF binaries
running in 32-bit compat mode under the arm64 kernel.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

4cf761cd

arm64: add AT_HWCAP2 support for 32-bit compat · 28964d32

由 Ard Biesheuvel 提交于 3月 03, 2014

Add support for the ELF auxv entry AT_HWCAP2 when running 32-bit
ELF binaries in compat mode.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

28964d32

compat: let architectures define __ARCH_WANT_COMPAT_SYS_GETDENTS64 · 0473c9b5

由 Heiko Carstens 提交于 3月 03, 2014

For architecture dependent compat syscalls in common code an architecture
must define something like __ARCH_WANT_<WHATEVER> if it wants to use the
code.
This however is not true for compat_sys_getdents64 for which architectures
must define __ARCH_OMIT_COMPAT_SYS_GETDENTS64 if they do not want the code.

This leads to the situation where all architectures, except mips, get the
compat code but only x86_64, arm64 and the generic syscall architectures
actually use it.

So invert the logic, so that architectures actively must do something to
get the compat code.

This way a couple of architectures get rid of otherwise dead code.
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>

0473c9b5

03 3月, 2014 4 次提交

arm64: KVM: flush VM pages before letting the guest enable caches · 9d218a1f

由 Marc Zyngier 提交于 1月 15, 2014

When the guest runs with caches disabled (like in an early boot
sequence, for example), all the writes are diectly going to RAM,
bypassing the caches altogether.

Once the MMU and caches are enabled, whatever sits in the cache
becomes suddenly visible, which isn't what the guest expects.

A way to avoid this potential disaster is to invalidate the cache
when the MMU is being turned on. For this, we hook into the SCTLR_EL1
trapping code, and scan the stage-2 page tables, invalidating the
pages/sections that have already been mapped in.
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>

9d218a1f

ARM: KVM: introduce kvm_p*d_addr_end · a3c8bd31

由 Marc Zyngier 提交于 2月 18, 2014

The use of p*d_addr_end with stage-2 translation is slightly dodgy,
as the IPA is 40bits, while all the p*d_addr_end helpers are
taking an unsigned long (arm64 is fine with that as unligned long
is 64bit).

The fix is to introduce 64bit clean versions of the same helpers,
and use them in the stage-2 page table code.
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>

a3c8bd31

arm64: KVM: trap VM system registers until MMU and caches are ON · 4d44923b

由 Marc Zyngier 提交于 1月 14, 2014

In order to be able to detect the point where the guest enables
its MMU and caches, trap all the VM related system registers.

Once we see the guest enabling both the MMU and the caches, we
can go back to a saner mode of operation, which is to leave these
registers in complete control of the guest.
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>

4d44923b

arm64: KVM: force cache clean on page fault when caches are off · 2d58b733

由 Marc Zyngier 提交于 1月 14, 2014

In order for the guest with caches off to observe data written
contained in a given page, we need to make sure that page is
committed to memory, and not just hanging in the cache (as
guest accesses are completely bypassing the cache until it
decides to enable it).

For this purpose, hook into the coherent_icache_guest_page
function and flush the region if the guest SCTLR_EL1
register doesn't show the MMU  and caches as being enabled.
The function also get renamed to coherent_cache_guest_page.
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>

2d58b733

01 3月, 2014 1 次提交

arm64: Fix !CONFIG_SMP kernel build · b57fc9e8

由 Catalin Marinas 提交于 2月 28, 2014

Commit fb4a9602 (arm64: kernel: fix per-cpu offset restore on
resume) uses per_cpu_offset() unconditionally during CPU wakeup,
however, this is only defined for the SMP case.
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
Reported-by: NDave P Martin <Dave.Martin@arm.com>

b57fc9e8

28 2月, 2014 3 次提交

arm64: mm: Add double logical invert to pte accessors · 84fe6826

由 Steve Capper 提交于 2月 25, 2014

Page table entries on ARM64 are 64 bits, and some pte functions such as
pte_dirty return a bitwise-and of a flag with the pte value. If the
flag to be tested resides in the upper 32 bits of the pte, then we run
into the danger of the result being dropped if downcast.

For example:
	gather_stats(page, md, pte_dirty(*pte), 1);
where pte_dirty(*pte) is downcast to an int.

This patch adds a double logical invert to all the pte_ accessors to
ensure predictable downcasting.
Signed-off-by: NSteve Capper <steve.capper@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

84fe6826

arm64: remove return value form psci_init() · 64b4f60f

由 Vladimir Murzin 提交于 2月 28, 2014

psci_init() is written to return err code if something goes wrong. However,
the single user, setup_arch(), doesn't care about it. Moreover, every error
path is supplied with a clear message which is enough for pleasant debugging.
Signed-off-by: NVladimir Murzin <vladimir.murzin@arm.com>
Acked-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

64b4f60f

arm64: Implement coherent DMA API based on swiotlb · 7363590d

由 Catalin Marinas 提交于 5月 21, 2013

This patch adds support for DMA API cache maintenance on SoCs without
hardware device cache coherency.
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

7363590d

26 2月, 2014 5 次提交

arm64: Convert asm/tlb.h to generic mmu_gather · 020c1427

由 Catalin Marinas 提交于 2月 11, 2014

Over the past couple of years, the generic mmu_gather gained range
tracking - 597e1c35 (mm/mmu_gather: enable tlb flush range in generic
mmu_gather), 2b047252 (Fix TLB gather virtual address range
invalidation corner cases) - and tlb_fast_mode() has been removed -
29eb7782 (arch, mm: Remove tlb_fast_mode()).

The new mmu_gather structure is now suitable for arm64 and this patch
converts the arch asm/tlb.h to the generic code. One functional
difference is the shift_arg_pages() case where previously the code was
flushing the full mm (no tlb_start_vma call) but now it flushes the
range given to tlb_gather_mmu() (possibly slightly more efficient
previously).
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>

020c1427

arm64: Extend the PCI I/O space to 16MB · 22bd1c91

由 Catalin Marinas 提交于 2月 04, 2014

The patch moves the PCI I/O space (currently at 64K) before the
earlyprintk mapping and extends it to 16MB.
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

22bd1c91

misc: debug: remove compilation warnings · 58dcc204

由 Vijaya Kumar K 提交于 1月 28, 2014

typecast instruction_pointer macro to unsigned long to
resolve following compiler warnings like
warning: format '%lx' expects argument of type 'long unsigned int',
but argument 2 has type 'u64' [-Wformat]
Signed-off-by: NVijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

58dcc204

arm64: KGDB: Add Basic KGDB support · bcf5763b

由 Vijaya Kumar K 提交于 1月 28, 2014

Add KGDB debug support for kernel debugging.
With this patch, basic KGDB debugging is possible.GDB register
layout is updated and GDB tool can establish connection with
target and can set/clear breakpoints.
Signed-off-by: NVijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
Reviewed-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

bcf5763b

arm64: Add macros to manage processor debug state · c7db4ff5

由 Vijaya Kumar K 提交于 1月 28, 2014

Add macros to enable and disable to manage PSTATE.D
for debugging. The macros local_dbg_save and local_dbg_restore
are moved to irqflags.h file

KGDB boot tests fail because of PSTATE.D is masked.
unmask it for debugging support
Signed-off-by: NVijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
Acked-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

c7db4ff5

10 2月, 2014 2 次提交

locking/mcs: Allow architecture specific asm files to be used for contended case · ddf1d169

由 Tim Chen 提交于 1月 21, 2014

This patch allows each architecture to add its specific assembly optimized
arch_mcs_spin_lock_contended and arch_mcs_spinlock_uncontended for
MCS lock and unlock functions.
Signed-off-by: NTim Chen <tim.c.chen@linux.intel.com>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: AswinChandramouleeswaran <aswin@hp.com>
Cc: George Spelvin <linux@horizon.com>
Cc: Rik vanRiel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: MichelLespinasse <walken@google.com>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Alex Shi <alex.shi@linaro.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Figo.zhang" <figo1802@gmail.com>
Cc: "Paul E.McKenney" <paulmck@linux.vnet.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Waiman Long <waiman.long@hp.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew R Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1390347382.3138.67.camel@schen9-DESKSigned-off-by: NIngo Molnar <mingo@kernel.org>

ddf1d169

locking/mcs: Order the header files in Kbuild of each architecture in alphabetical order · b119fa61

由 Tim Chen 提交于 1月 21, 2014

We perform a clean up of the Kbuid files in each architecture.
We order the files in each Kbuild in alphabetical order
by running the below script.

for i in arch/*/include/asm/Kbuild
do
        cat $i | gawk '/^generic-y/ {
                i = 3;
                do {
                        for (; i <= NF; i++) {
                                if ($i == "\\") {
                                        getline;
                                        i = 1;
                                        continue;
                                }
                                if ($i != "")
                                        hdr[$i] = $i;
                        }
                        break;
                } while (1);
                next;
        }
        // {
                print $0;
        }
        END {
                n = asort(hdr);
                for (i = 1; i <= n; i++)
                        print "generic-y += " hdr[i];
        }' > ${i}.sorted;
        mv ${i}.sorted $i;
done
Signed-off-by: NTim Chen <tim.c.chen@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Matthew R Wilcox <matthew.r.wilcox@intel.com>
Cc: AswinChandramouleeswaran <aswin@hp.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: "Paul E.McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: "Figo.zhang" <figo1802@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Waiman Long <waiman.long@hp.com>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Alex Shi <alex.shi@linaro.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: George Spelvin <linux@horizon.com>
Cc: MichelLespinasse <walken@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
[ Fixed build bug. ]
Signed-off-by: NIngo Molnar <mingo@kernel.org>

b119fa61

08 2月, 2014 2 次提交

arm64: asm: remove redundant "cc" clobbers · 95c41896

由 Will Deacon 提交于 2月 04, 2014

cbnz/tbnz don't update the condition flags, so remove the "cc" clobbers
from inline asm blocks that only use these instructions to implement
conditional branches.
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

95c41896

arm64: atomics: fix use of acquire + release for full barrier semantics · 8e86f0b4

由 Will Deacon 提交于 2月 04, 2014

Linux requires a number of atomic operations to provide full barrier
semantics, that is no memory accesses after the operation can be
observed before any accesses up to and including the operation in
program order.

On arm64, these operations have been incorrectly implemented as follows:

	// A, B, C are independent memory locations

	<Access [A]>

	// atomic_op (B)
1:	ldaxr	x0, [B]		// Exclusive load with acquire
	<op(B)>
	stlxr	w1, x0, [B]	// Exclusive store with release
	cbnz	w1, 1b

	<Access [C]>

The assumption here being that two half barriers are equivalent to a
full barrier, so the only permitted ordering would be A -> B -> C
(where B is the atomic operation involving both a load and a store).

Unfortunately, this is not the case by the letter of the architecture
and, in fact, the accesses to A and C are permitted to pass their
nearest half barrier resulting in orderings such as Bl -> A -> C -> Bs
or Bl -> C -> A -> Bs (where Bl is the load-acquire on B and Bs is the
store-release on B). This is a clear violation of the full barrier
requirement.

The simple way to fix this is to implement the same algorithm as ARMv7
using explicit barriers:

	<Access [A]>

	// atomic_op (B)
	dmb	ish		// Full barrier
1:	ldxr	x0, [B]		// Exclusive load
	<op(B)>
	stxr	w1, x0, [B]	// Exclusive store
	cbnz	w1, 1b
	dmb	ish		// Full barrier

	<Access [C]>

but this has the undesirable effect of introducing *two* full barrier
instructions. A better approach is actually the following, non-intuitive
sequence:

	<Access [A]>

	// atomic_op (B)
1:	ldxr	x0, [B]		// Exclusive load
	<op(B)>
	stlxr	w1, x0, [B]	// Exclusive store with release
	cbnz	w1, 1b
	dmb	ish		// Full barrier

	<Access [C]>

The simple observations here are:

  - The dmb ensures that no subsequent accesses (e.g. the access to C)
    can enter or pass the atomic sequence.

  - The dmb also ensures that no prior accesses (e.g. the access to A)
    can pass the atomic sequence.

  - Therefore, no prior access can pass a subsequent access, or
    vice-versa (i.e. A is strictly ordered before C).

  - The stlxr ensures that no prior access can pass the store component
    of the atomic operation.

The only tricky part remaining is the ordering between the ldxr and the
access to A, since the absence of the first dmb means that we're now
permitting re-ordering between the ldxr and any prior accesses.

From an (arbitrary) observer's point of view, there are two scenarios:

  1. We have observed the ldxr. This means that if we perform a store to
     [B], the ldxr will still return older data. If we can observe the
     ldxr, then we can potentially observe the permitted re-ordering
     with the access to A, which is clearly an issue when compared to
     the dmb variant of the code. Thankfully, the exclusive monitor will
     save us here since it will be cleared as a result of the store and
     the ldxr will retry. Notice that any use of a later memory
     observation to imply observation of the ldxr will also imply
     observation of the access to A, since the stlxr/dmb ensure strict
     ordering.

  2. We have not observed the ldxr. This means we can perform a store
     and influence the later ldxr. However, that doesn't actually tell
     us anything about the access to [A], so we've not lost anything
     here either when compared to the dmb variant.

This patch implements this solution for our barriered atomic operations,
ensuring that we satisfy the full barrier requirements where they are
needed.

Cc: <stable@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

8e86f0b4

06 2月, 2014 1 次提交

arm64: barriers: allow dsb macro to take option parameter · 4a7ac12e

由 Will Deacon 提交于 2月 06, 2014

The dsb instruction takes an option specifying both the target access
types and shareability domain.

This patch allows such an option to be passed to the dsb macro,
resulting in potentially more efficient code. Currently the option is
ignored until all callers are updated (unlike ARM, the option is
mandated by the assembler).
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

4a7ac12e

05 2月, 2014 1 次提交

arm64: compat: Wire up new AArch32 syscalls · 6290b53d

由 Catalin Marinas 提交于 2月 05, 2014

This patch enables sys_compat, sys_finit_module, sys_sched_setattr and
sys_sched_getattr for compat (AArch32) applications.
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

6290b53d

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功