提交 · 7d2b6ef19cf0f98cef17aa5185de3631a618710a · openanolis / cloud-kernel

17 4月, 2015 1 次提交

arm: use asm-generic for seccomp.h · 6bcf4e9a

由 Kees Cook 提交于 4月 16, 2015

Switch to using the newly created asm-generic/seccomp.h for the seccomp
strict mode syscall definitions. Definitions were identical.
Signed-off-by: NKees Cook <keescook@chromium.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

6bcf4e9a

15 4月, 2015 2 次提交

mm: fold arch_randomize_brk into ARCH_HAS_ELF_RANDOMIZE · 204db6ed

由 Kees Cook 提交于 4月 14, 2015

The arch_randomize_brk() function is used on several architectures,
even those that don't support ET_DYN ASLR. To avoid bulky extern/#define
tricks, consolidate the support under CONFIG_ARCH_HAS_ELF_RANDOMIZE for
the architectures that support it, while still handling CONFIG_COMPAT_BRK.
Signed-off-by: NKees Cook <keescook@chromium.org>
Cc: Hector Marco-Gisbert <hecmargi@upv.es>
Cc: Russell King <linux@arm.linux.org.uk>
Reviewed-by: NIngo Molnar <mingo@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: "David A. Long" <dave.long@linaro.org>
Cc: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Arun Chandran <achandran@mvista.com>
Cc: Yann Droneaud <ydroneaud@opteya.com>
Cc: Min-Hua Chen <orca.chen@gmail.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Alex Smith <alex@alex-smith.me.uk>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Vineeth Vijayan <vvijayan@mvista.com>
Cc: Jeff Bailey <jeffbailey@google.com>
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Behan Webster <behanw@converseincode.com>
Cc: Ismael Ripoll <iripoll@upv.es>
Cc: Jan-Simon Mller <dl9pf@gmx.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

204db6ed

ARM: allow 16-bit instructions in ALT_UP() · 89c6bc58

由 Russell King 提交于 4月 09, 2015

Allow ALT_UP() to cope with a 16-bit Thumb instruction by automatically
inserting a following nop instruction. This allows us to care less
about getting the assembler to emit a 32-bit thumb instruction.
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

89c6bc58

13 4月, 2015 1 次提交

arm: Remove signal translation and exec_domain · a4980448

由 Richard Weinberger 提交于 7月 13, 2014

As execution domain support is gone we can remove
signal translation from the signal code and remove
exec_domain from thread_info.
Signed-off-by: NRichard Weinberger <richard@nod.at>

a4980448

10 4月, 2015 1 次提交

crypto: arm/sha1 - move SHA-1 ARM asm implementation to base layer · 90451d6b

由 Ard Biesheuvel 提交于 4月 09, 2015

This removes all the boilerplate from the existing implementation,
and replaces it with calls into the base layer.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

90451d6b

09 4月, 2015 1 次提交

jump_label: Allow asm/jump_label.h to be included in assembly · 55dd0df7

由 Anton Blanchard 提交于 4月 09, 2015

Wrap asm/jump_label.h for all archs with #ifndef __ASSEMBLY__.
Since these are kernel only headers, we don't need #ifdef
__KERNEL__ so can simplify things a bit.

If an architecture wants to use jump labels in assembly, it
will still need to define a macro to create the __jump_table
entries (see ARCH_STATIC_BRANCH in the powerpc asm/jump_label.h
for an example).
Signed-off-by: NAnton Blanchard <anton@samba.org>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: benh@kernel.crashing.org
Cc: catalin.marinas@arm.com
Cc: davem@davemloft.net
Cc: heiko.carstens@de.ibm.com
Cc: jbaron@akamai.com
Cc: linux@arm.linux.org.uk
Cc: linuxppc-dev@lists.ozlabs.org
Cc: liuj97@gmail.com
Cc: mgorman@suse.de
Cc: mmarek@suse.cz
Cc: mpe@ellerman.id.au
Cc: paulus@samba.org
Cc: ralf@linux-mips.org
Cc: rostedt@goodmis.org
Cc: schwidefsky@de.ibm.com
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1428551492-21977-1-git-send-email-anton@samba.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

55dd0df7

04 4月, 2015 1 次提交

ARM: MCPM: move the algorithmic complexity to the core code · d3a87544

由 Nicolas Pitre 提交于 3月 11, 2015

All backends are reimplementing a variation of the same CPU reference
count handling. They are also responsible for driving the MCPM special
low-level locking. This is needless duplication, involving algorithmic
requirements that are not necessarily obvious to the uninitiated.
And from past code review experience, those were all initially
implemented badly.

After 3 years, it is time to refactor as much common code to the core
MCPM facility to make the backends as simple as possible.  To avoid a
flag day, the new scheme is introduced in parallel to the existing
backend interface.  When all backends are converted over, the
compatibility interface could be removed.

The new MCPM backend interface implements simpler methods addressing
very platform specific tasks performed under lock protection while
keeping the algorithmic complexity and race avoidance local to the
core code.
Signed-off-by: NNicolas Pitre <nico@linaro.org>
Tested-by: NDaniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: NOlof Johansson <olof@lixom.net>

d3a87544

03 4月, 2015 1 次提交

ARM, clocksource/drivers: Provide read_boot_clock64() and read_persistent_clock64() and use them · cb850717

由 Xunlei Pang 提交于 4月 01, 2015

As part of addressing "y2038 problem" for in-kernel uses, this
patch converts read_boot_clock() to read_boot_clock64() and
read_persistent_clock() to read_persistent_clock64() using
timespec64 by converting clock_access_fn to use timespec64.
Signed-off-by: NXunlei Pang <pang.xunlei@linaro.org>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
Acked-by: Thierry Reding <treding@nvidia.com> (for tegra part)
Cc: Russell King <rmk@dyn-67.arm.linux.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1427945681-29972-7-git-send-email-john.stultz@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

cb850717

02 4月, 2015 1 次提交

ARM: 8338/1: kexec: Relax SMP validation to improve DT compatibility · fee3fd4f

由 Geert Uytterhoeven 提交于 4月 01, 2015

When trying to kexec into a new kernel on a platform where multiple CPU
cores are present, but no SMP bringup code is available yet, the
kexec_load system call fails with:

    kexec_load failed: Invalid argument

The SMP test added to machine_kexec_prepare() in commit 2103f6cb
("ARM: 7807/1: kexec: validate CPU hotplug support") wants to prohibit
kexec on SMP platforms where it cannot disable secondary CPUs.
However, this test is too strict: if the secondary CPUs couldn't be
enabled in the first place, there's no need to disable them later at
kexec time.  Hence skip the test in the absence of SMP bringup code.

This allows to add all CPU cores to the DTS from the beginning, without
having to implement SMP bringup first, improving DT compatibility.
Signed-off-by: NGeert Uytterhoeven <geert+renesas@glider.be>
Acked-by: NStephen Warren <swarren@nvidia.com>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

fee3fd4f

31 3月, 2015 1 次提交

KVM: arm/arm64: rework MMIO abort handling to use KVM MMIO bus · 950324ab

由 Andre Przywara 提交于 3月 28, 2015

Currently we have struct kvm_exit_mmio for encapsulating MMIO abort
data to be passed on from syndrome decoding all the way down to the
VGIC register handlers. Now as we switch the MMIO handling to be
routed through the KVM MMIO bus, it does not make sense anymore to
use that structure already from the beginning. So we keep the data in
local variables until we put them into the kvm_io_bus framework.
Then we fill kvm_exit_mmio in the VGIC only, making it a VGIC private
structure. On that way we replace the data buffer in that structure
with a pointer pointing to a single location in a local variable, so
we get rid of some copying on the way.
With all of the virtual GIC emulation code now being registered with
the kvm_io_bus, we can remove all of the old MMIO handling code and
its dispatching functionality.

I didn't bother to rename kvm_exit_mmio (to vgic_mmio or something),
because that touches a lot of code lines without any good reason.

This is based on an original patch by Nikolay.
Signed-off-by: NAndre Przywara <andre.przywara@arm.com>
Cc: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

950324ab

30 3月, 2015 1 次提交

ARM: 8322/1: keep .text and .fixup regions closer together · c4a84ae3

由 Ard Biesheuvel 提交于 3月 24, 2015

This moves all fixup snippets to the .text.fixup section, which is
a special section that gets emitted along with the .text section
for each input object file, i.e., the snippets are kept much closer
to the code they refer to, which helps prevent linker failure on
large kernels.
Acked-by: NNicolas Pitre <nico@linaro.org>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

c4a84ae3

29 3月, 2015 1 次提交

ARM: 8327/1: zImage: add support for ARMv7-M · c20611df

由 Joachim Eastwood 提交于 3月 25, 2015

This patch makes it possible to enter zImage in Thumb mode for ARMv7-M
(Cortex-M) CPUs that do not support ARM mode. The kernel entry is also
made in Thumb mode.

[ukl: fix spelling in commit log, return early in call_cache_fn]
Signed-off-by: NJoachim Eastwood <manabian@gmail.com>
Tested-by: NStefan Agner <stefan@agner.ch>
Tested-by: NEzequiel Garcia <ezequiel@vanguardiasur.com.ar>
Tested-by: NChanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: NUwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

c20611df

28 3月, 2015 4 次提交

ARM: 8320/1: fix integer overflow in ELF_ET_DYN_BASE · 8defb336

由 Andrey Ryabinin 提交于 3月 20, 2015

Usually ELF_ET_DYN_BASE is 2/3 of TASK_SIZE. With 3G/1G user/kernel
split this is not so, because 2*TASK_SIZE overflows 32 bits,
so the actual value of ELF_ET_DYN_BASE is:
	(2 * TASK_SIZE / 3) = 0x2a000000

When ASLR is disabled PIE binaries will load at ELF_ET_DYN_BASE address.
On 32bit platforms AddressSanitzer uses addresses [0x20000000 - 0x40000000]
for shadow memory [1]. So ASan doesn't work for PIE binaries when ASLR disabled
as it fails to map shadow memory.
Also after Kees's 'split ET_DYN ASLR from mmap ASLR' patchset PIE binaries
has a high chance of loading somewhere in between [0x2a000000 - 0x40000000]
even if ASLR enabled. This makes ASan with PIE absolutely incompatible.

Fix overflow by dividing TASK_SIZE prior to multiplying.
After this patch ELF_ET_DYN_BASE equals to (for CONFIG_VMSPLIT_3G=y):
	(TASK_SIZE / 3 * 2) = 0x7f555554

[1] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm#MappingSigned-off-by: NAndrey Ryabinin <a.ryabinin@samsung.com>
Reported-by: NMaria Guseva <m.guseva@samsung.com>
Cc: stable@vger.kernel.org
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

8defb336

ARM: 8318/1: treat CPU feature register fields as signed quantities · b8c9592b

由 Ard Biesheuvel 提交于 3月 19, 2015

The various CPU feature registers consist of 4-bit blocks that
represent signed quantities, whose positive values represent
incremental features, and whose negative values are reserved.

To improve forward compatibility, update the feature detection
code to take possible future higher values into account, but
ignore negative values.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

b8c9592b

ARM: 8329/1: miscellaneous vdso infrastructure, preparation · 1713ce7c

由 Nathan Lynch 提交于 3月 25, 2015

Define the layout of the data structure shared between kernel and
userspace.

Track the vdso address in the mm_context; needed for communicating
AT_SYSINFO_EHDR to the ELF loader.

Add declarations for arm_install_vdso; implementation is in a
following patch.

Define AT_SYSINFO_EHDR, and, if CONFIG_VDSO=y, report the vdso shared
object address via the ELF auxiliary vector.

Note - this adds the AT_SYSINFO_EHDR in a new user-visible header
asm/auxvec.h; this is consistent with other architectures.
Signed-off-by: NNathan Lynch <nathan_lynch@mentor.com>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

1713ce7c

ARM: Remove mach-msm and associated ARM architecture code · c0c89faf

由 Stephen Boyd 提交于 3月 13, 2015

The maintainers for mach-msm no longer have any plans to support
or test the platforms supported by this architecture[1]. Most likely
there aren't any active users of this code anyway, so let's
delete it.

[1] http://lkml.kernel.org/r/20150307031212.GA8434@fifo99.com
Cc: David Brown <davidb@codeaurora.org>
Cc: Bryan Huntsman <bryanh@codeaurora.org>
Cc: Daniel Walker <dwalker@fifo99.com>
Signed-off-by: NStephen Boyd <sboyd@codeaurora.org>
Signed-off-by: NKumar Gala <galak@codeaurora.org>

c0c89faf

27 3月, 2015 1 次提交

arm-cci: Get rid of secure transactions for PMU driver · 772742a6

由 Suzuki K. Poulose 提交于 3月 18, 2015

Avoid secure transactions while probing the CCI PMU. The
existing code makes use of the Peripheral ID2 (PID2) register
to determine the revision of the CCI400, which requires a
secure transaction. This puts a limitation on the usage of the
driver on systems running non-secure Linux(e.g, ARM64).

Updated the device-tree binding for cci pmu node to add the explicit
revision number for the compatible field.

The supported strings are :
	arm,cci-400-pmu,r0
	arm,cci-400-pmu,r1
	arm,cci-400-pmu - DEPRECATED. See NOTE below

NOTE: If the revision is not mentioned, we need to probe the cci revision,
which could be fatal on a platform running non-secure. We need a reliable way
to know if we can poke the CCI registers at runtime on ARM32. We depend on
'mcpm_is_available()' when it is available. mcpm_is_available() returns true
only when there is a registered driver for mcpm. Otherwise, we assume that we
don't have secure access, and skips probing the revision number(ARM64 case).

The MCPM should figure out if it is safe to access the CCI. Unfortunately
there isn't a reliable way to indicate the same via dtb. This patch doesn't
address/change the current situation. It only deals with the CCI-PMU, leaving
the assumptions about the secure access as it has been, prior to this patch.

Cc: devicetree@vger.kernel.org
Cc: Punit Agrawal <punit.agrawal@arm.com>
Tested-by: NSudeep Holla <sudeep.holla@arm.com>
Acked-by: NNicolas Pitre <nicolas.pitre@linaro.org>
Acked-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NSuzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

772742a6

24 3月, 2015 3 次提交

ARM: pmu: add support for interrupt-affinity property · 9fd85eb5

由 Will Deacon 提交于 3月 06, 2015

Historically, the PMU devicetree bindings have expected SPIs to be
listed in order of *logical* CPU number. This is problematic for
bootloaders, especially when the boot CPU (logical ID 0) isn't listed
first in the devicetree.

This patch adds a new optional property, interrupt-affinity, to the
PMU node which allows the interrupt affinity to be described using
a list of phandled to CPU nodes, with each entry in the list
corresponding to the SPI at the same index in the interrupts property.

Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

9fd85eb5

ARM: cpuidle: Add a cpuidle ops structure to be used for DT · 449e056c

由 Daniel Lezcano 提交于 2月 02, 2015

The current state of the different cpuidle drivers is the different PM
operations are passed via the platform_data using the platform driver
paradigm.

This approach allowed to split the low level PM code from the arch specific
and the generic cpuidle code.

Unfortunately there are complaints about this approach as, in the context of the
single kernel image, we have multiple drivers loaded in memory for nothing and
the platform driver is not adequate for cpuidle.

This patch provides a common interface via cpuidle ops for all new cpuidle
driver and a definition for the device tree.

It will allow with the next patches to a have a common definition with ARM64
and share the same cpuidle driver.

The code is optimized to use the __init section intensively in order to reduce
the memory footprint after the driver is initialized and unify the function
names with ARM64.
Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: NKevin Hilman <khilman@linaro.org>
Acked-by: NRob Herring <robherring2@gmail.com>
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Tested-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>

449e056c

ARM: cpuidle: Remove duplicate header inclusion · eeebc3bb

由 Daniel Lezcano 提交于 2月 02, 2015

The cpu_do_idle() function is always used by the cpuidle drivers.

That led to have each driver including cpuidle.h and proc-fns.h, they are
always paired. That makes a lot of duplicate headers inclusion. Instead of
including both in each .c file, move the proc-fns.h header inclusion in the
cpuidle.h header file directly, so we can save some line of code.
Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: NKevin Hilman <khilman@linaro.org>
Acked-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Tested-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>

eeebc3bb

23 3月, 2015 1 次提交

arm64: KVM: use ID map with increased VA range if required · e4c5a685

由 Ard Biesheuvel 提交于 3月 19, 2015

This patch modifies the HYP init code so it can deal with system
RAM residing at an offset which exceeds the reach of VA_BITS.

Like for EL1, this involves configuring an additional level of
translation for the ID map. However, in case of EL2, this implies
that all translations use the extra level, as we cannot seamlessly
switch between translation tables with different numbers of
translation levels.

So add an extra translation table at the root level. Since the
ID map and the runtime HYP map are guaranteed not to overlap, they
can share this root level, and we can essentially merge these two
tables into one.
Tested-by: NMarc Zyngier <marc.zyngier@arm.com>
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

e4c5a685

13 3月, 2015 1 次提交

arm/arm64: KVM: Implement Stage-2 page aging · 35307b9a

由 Marc Zyngier 提交于 3月 12, 2015

Until now, KVM/arm didn't care much for page aging (who was swapping
anyway?), and simply provided empty hooks to the core KVM code. With
server-type systems now being available, things are quite different.

This patch implements very simple support for page aging, by clearing
the Access flag in the Stage-2 page tables. On access fault, the current
fault handling will write the PTE or PMD again, putting the Access flag
back on.

It should be possible to implement a much faster handling for Access
faults, but that's left for a later patch.

With this in place, performance in VMs is degraded much more gracefully.
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Acked-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

35307b9a

12 3月, 2015 2 次提交

KVM: arm/arm64: add irqfd support · 174178fe

由 Eric Auger 提交于 3月 04, 2015

This patch enables irqfd on arm/arm64.

Both irqfd and resamplefd are supported. Injection is implemented
in vgic.c without routing.

This patch enables CONFIG_HAVE_KVM_EVENTFD and CONFIG_HAVE_KVM_IRQFD.

KVM_CAP_IRQFD is now advertised. KVM_CAP_IRQFD_RESAMPLE capability
automatically is advertised as soon as CONFIG_HAVE_KVM_IRQFD is set.

Irqfd injection is restricted to SPI. The rationale behind not
supporting PPI irqfd injection is that any device using a PPI would
be a private-to-the-CPU device (timer for instance), so its state
would have to be context-switched along with the VCPU and would
require in-kernel wiring anyhow. It is not a relevant use case for
irqfds.
Signed-off-by: NEric Auger <eric.auger@linaro.org>
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

174178fe

KVM: arm/arm64: implement kvm_arch_intc_initialized · c1426e4c

由 Eric Auger 提交于 3月 04, 2015

On arm/arm64 the VGIC is dynamically instantiated and it is useful
to expose its state, especially for irqfd setup.

This patch defines __KVM_HAVE_ARCH_INTC_INITIALIZED and
implements kvm_arch_intc_initialized.
Signed-off-by: NEric Auger <eric.auger@linaro.org>
Acked-by: NChristoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: NAndre Przywara <andre.przywara@arm.com>
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

c1426e4c

11 3月, 2015 2 次提交

arm64: KVM: Do not use pgd_index to index stage-2 pgd · 04b8dc85

由 Marc Zyngier 提交于 3月 10, 2015

The kernel's pgd_index macro is designed to index a normal, page
sized array. KVM is a bit diffferent, as we can use concatenated
pages to have a bigger address space (for example 40bit IPA with
4kB pages gives us an 8kB PGD.

In the above case, the use of pgd_index will always return an index
inside the first 4kB, which makes a guest that has memory above
0x8000000000 rather unhappy, as it spins forever in a page fault,
whist the host happilly corrupts the lower pgd.

The obvious fix is to get our own kvm_pgd_index that does the right
thing(tm).

Tested on X-Gene with a hacked kvmtool that put memory at a stupidly
high address.
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

04b8dc85

arm64: KVM: Fix stage-2 PGD allocation to have per-page refcounting · a987370f

由 Marc Zyngier 提交于 3月 10, 2015

We're using __get_free_pages with to allocate the guest's stage-2
PGD. The standard behaviour of this function is to return a set of
pages where only the head page has a valid refcount.

This behaviour gets us into trouble when we're trying to increment
the refcount on a non-head page:

page:ffff7c00cfb693c0 count:0 mapcount:0 mapping:          (null) index:0x0
flags: 0x4000000000000000()
page dumped because: VM_BUG_ON_PAGE((*({ __attribute__((unused)) typeof((&page->_count)->counter) __var = ( typeof((&page->_count)->counter)) 0; (volatile typeof((&page->_count)->counter) *)&((&page->_count)->counter); })) <= 0)
BUG: failure at include/linux/mm.h:548/get_page()!
Kernel panic - not syncing: BUG!
CPU: 1 PID: 1695 Comm: kvm-vcpu-0 Not tainted 4.0.0-rc1+ #3825
Hardware name: APM X-Gene Mustang board (DT)
Call trace:
[<ffff80000008a09c>] dump_backtrace+0x0/0x13c
[<ffff80000008a1e8>] show_stack+0x10/0x1c
[<ffff800000691da8>] dump_stack+0x74/0x94
[<ffff800000690d78>] panic+0x100/0x240
[<ffff8000000a0bc4>] stage2_get_pmd+0x17c/0x2bc
[<ffff8000000a1dc4>] kvm_handle_guest_abort+0x4b4/0x6b0
[<ffff8000000a420c>] handle_exit+0x58/0x180
[<ffff80000009e7a4>] kvm_arch_vcpu_ioctl_run+0x114/0x45c
[<ffff800000099df4>] kvm_vcpu_ioctl+0x2e0/0x754
[<ffff8000001c0a18>] do_vfs_ioctl+0x424/0x5c8
[<ffff8000001c0bfc>] SyS_ioctl+0x40/0x78
CPU0: stopping

A possible approach for this is to split the compound page using
split_page() at allocation time, and change the teardown path to
free one page at a time.  It turns out that alloc_pages_exact() and
free_pages_exact() does exactly that.

While we're at it, the PGD allocation code is reworked to reduce
duplication.

This has been tested on an X-Gene platform with a 4kB/48bit-VA host
kernel, and kvmtool hacked to place memory in the second page of
the hardware PGD (PUD for the host kernel). Also regression-tested
on a Cubietruck (Cortex-A7).

 [ Reworked to use alloc_pages_exact() and free_pages_exact() and to
   return pointers directly instead of by reference as arguments
    - Christoffer ]
Reported-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

a987370f

05 3月, 2015 1 次提交

ARM: at91: debug: fix non MMU debug · 5a5a6451

由 Alexandre Belloni 提交于 3月 04, 2015

Linux may be used without MMU on atmel SoCs, fix debug in this configuration.
Signed-off-by: NAlexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: NNicolas Ferre <nicolas.ferre@atmel.com>

5a5a6451

24 2月, 2015 1 次提交

ARM: KVM: Fix size check in __coherent_cache_guest_page · a050dfb2

由 Jan Kiszka 提交于 2月 07, 2015

The check is supposed to catch page-unaligned sizes, not the inverse.
Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

a050dfb2

13 2月, 2015 2 次提交

all arches, signal: move restart_block to struct task_struct · f56141e3

由 Andy Lutomirski 提交于 2月 12, 2015

If an attacker can cause a controlled kernel stack overflow, overwriting
the restart block is a very juicy exploit target.  This is because the
restart_block is held in the same memory allocation as the kernel stack.

Moving the restart block to struct task_struct prevents this exploit by
making the restart_block harder to locate.

Note that there are other fields in thread_info that are also easy
targets, at least on some architectures.

It's also a decent simplification, since the restart code is more or less
identical on all architectures.

[james.hogan@imgtec.com: metag: align thread_info::supervisor_stack]
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: David Miller <davem@davemloft.net>
Acked-by: NRichard Weinberger <richard@nod.at>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f56141e3

mm: convert p[te|md]_mknonnuma and remaining page table manipulations · 4d942466

由 Mel Gorman 提交于 2月 12, 2015

With PROT_NONE, the traditional page table manipulation functions are
sufficient.

[andre.przywara@arm.com: fix compiler warning in pmdp_invalidate()]
[akpm@linux-foundation.org: fix build with STRICT_MM_TYPECHECKS]
Signed-off-by: NMel Gorman <mgorman@suse.de>
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Acked-by: NAneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Tested-by: NSasha Levin <sasha.levin@oracle.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4d942466

12 2月, 2015 2 次提交

arm: define __PAGETABLE_PMD_FOLDED for !LPAE · 8aa76875

由 Kirill A. Shutemov 提交于 2月 11, 2015

ARM uses custom implementation of PMD folding in 2-level page table case.
Generic code expects to see __PAGETABLE_PMD_FOLDED to be defined if PMD is
folded, but ARM doesn't do this.  Let's fix it.

Defining __PAGETABLE_PMD_FOLDED will drop out unused __pmd_alloc().  It
also fixes problems with recently-introduced pmd accounting on ARM without
LPAE.
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: NNishanth Menon <nm@ti.com>
Reported-by: NSimon Horman <horms@verge.net.au>
Tested-by: NSimon Horman <horms+renesas@verge.net.au>
Tested-by: NFabio Estevam <festevam@gmail.com>
Tested-by: NFelipe Balbi <balbi@ti.com>
Tested-by: NNishanth Menon <nm@ti.com>
Tested-by: NPeter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: NKrzysztof Kozlowski <k.kozlowski@samsung.com>
Tested-by: NGeert Uytterhoeven <geert+renesas@glider.be>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8aa76875

mm: make FIRST_USER_ADDRESS unsigned long on all archs · d016bf7e

由 Kirill A. Shutemov 提交于 2月 11, 2015

LKP has triggered a compiler warning after my recent patch "mm: account
pmd page tables to the process":

    mm/mmap.c: In function 'exit_mmap':
 >> mm/mmap.c:2857:2: warning: right shift count >= width of type [enabled by default]

The code:

 > 2857                WARN_ON(mm_nr_pmds(mm) >
   2858                                round_up(FIRST_USER_ADDRESS, PUD_SIZE) >> PUD_SHIFT);

In this, on tile, we have FIRST_USER_ADDRESS defined as 0.  round_up() has
the same type -- int.  PUD_SHIFT.

I think the best way to fix it is to define FIRST_USER_ADDRESS as unsigned
long.  On every arch for consistency.
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: NWu Fengguang <fengguang.wu@intel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d016bf7e

11 2月, 2015 1 次提交

arm: drop L_PTE_FILE and pte_file()-related helpers · b007ea79

由 Kirill A. Shutemov 提交于 2月 10, 2015

We've replaced remap_file_pages(2) implementation with emulation.  Nobody
creates non-linear mapping anymore.

This patch also adjust __SWP_TYPE_SHIFT, effectively increase size of
possible swap file to 128G.
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b007ea79

10 2月, 2015 1 次提交

ARM: 8300/1: teach __asmeq that r11 == fp and r12 == ip · ada63d40

由 Ard Biesheuvel 提交于 1月 29, 2015

The __asmeq macro is used inside inline asm statements to ensure that
register asm variables that explicitly specify a register are mapped
correctly onto those registers when used in inline asm input and output
constraints. However, the string based matching fails to take into
account that 'fp' is often referred to as 'r11' and 'ip' is often
referred to as 'r12', (e.g., by clang), causing false negatives.

Fix this by making __asmeq consider the ("fp","r11"), ("r11","fp"),
("ip","r12") and ("r12","ip") cases specifically.
Reviewed-by: NAlex Elder <elder@linaro.org>
Acked-by: NNicolas Pitre <nico@linaro.org>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

ada63d40

06 2月, 2015 1 次提交

kvm: add halt_poll_ns module parameter · f7819512

由 Paolo Bonzini 提交于 2月 04, 2015

This patch introduces a new module parameter for the KVM module; when it
is present, KVM attempts a bit of polling on every HLT before scheduling
itself out via kvm_vcpu_block.

This parameter helps a lot for latency-bound workloads---in particular
I tested it with O_DSYNC writes with a battery-backed disk in the host.
In this case, writes are fast (because the data doesn't have to go all
the way to the platters) but they cannot be merged by either the host or
the guest. KVM's performance here is usually around 30% of bare metal,
or 50% if you use cache=directsync or cache=writethrough (these
parameters avoid that the guest sends pointless flush requests, and
at the same time they are not slow because of the battery-backed cache).
The bad performance happens because on every halt the host CPU decides
to halt itself too. When the interrupt comes, the vCPU thread is then
migrated to a new physical CPU, and in general the latency is horrible
because the vCPU thread has to be scheduled back in.

With this patch performance reaches 60-65% of bare metal and, more
important, 99% of what you get if you use idle=poll in the guest. This
means that the tunable gets rid of this particular bottleneck, and more
work can be done to improve performance in the kernel or QEMU.

Of course there is some price to pay; every time an otherwise idle vCPUs
is interrupted by an interrupt, it will poll unnecessarily and thus
impose a little load on the host. The above results were obtained with
a mostly random value of the parameter (500000), and the load was around
1.5-2.5% CPU usage on one of the host's core for each idle guest vCPU.

The patch also adds a new stat, /sys/kernel/debug/kvm/halt_successful_poll,
that can be used to tune the parameter. It counts how many HLT
instructions received an interrupt during the polling period; each
successful poll avoids that Linux schedules the VCPU thread out and back
in, and may also avoid a likely trip to C1 and back for the physical CPU.

While the VM is idle, a Linux 4 VCPU VM halts around 10 times per second.
Of these halts, almost all are failed polls. During the benchmark,
instead, basically all halts end within the polling period, except a more
or less constant stream of 50 per second coming from vCPUs that are not
running the benchmark. The wasted time is thus very low. Things may
be slightly different for Windows VMs, which have a ~10 ms timer tick.

The effect is also visible on Marcelo's recently-introduced latency
test for the TSC deadline timer. Though of course a non-RT kernel has
awful latency bounds, the latency of the timer is around 8000-10000 clock
cycles compared to 20000-120000 without setting halt_poll_ns. For the TSC
deadline timer, thus, the effect is both a smaller average latency and
a smaller variance.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

f7819512

30 1月, 2015 3 次提交

arm/arm64: KVM: Use kernel mapping to perform invalidation on page fault · 0d3e4d4f

由 Marc Zyngier 提交于 1月 05, 2015

When handling a fault in stage-2, we need to resync I$ and D$, just
to be sure we don't leave any old cache line behind.

That's very good, except that we do so using the *user* address.
Under heavy load (swapping like crazy), we may end up in a situation
where the page gets mapped in stage-2 while being unmapped from
userspace by another CPU.

At that point, the DC/IC instructions can generate a fault, which
we handle with kvm->mmu_lock held. The box quickly deadlocks, user
is unhappy.

Instead, perform this invalidation through the kernel mapping,
which is guaranteed to be present. The box is much happier, and so
am I.
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

0d3e4d4f

arm/arm64: KVM: Invalidate data cache on unmap · 363ef89f

由 Marc Zyngier 提交于 12月 19, 2014

Let's assume a guest has created an uncached mapping, and written
to that page. Let's also assume that the host uses a cache-coherent
IO subsystem. Let's finally assume that the host is under memory
pressure and starts to swap things out.

Before this "uncached" page is evicted, we need to make sure
we invalidate potential speculated, clean cache lines that are
sitting there, or the IO subsystem is going to swap out the
cached view, loosing the data that has been written directly
into memory.
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

363ef89f

arm/arm64: KVM: Use set/way op trapping to track the state of the caches · 3c1e7165

由 Marc Zyngier 提交于 12月 19, 2014

Trying to emulate the behaviour of set/way cache ops is fairly
pointless, as there are too many ways we can end-up missing stuff.
Also, there is some system caches out there that simply ignore
set/way operations.

So instead of trying to implement them, let's convert it to VA ops,
and use them as a way to re-enable the trapping of VM ops. That way,
we can detect the point when the MMU/caches are turned off, and do
a full VM flush (which is what the guest was trying to do anyway).

This allows a 32bit zImage to boot on the APM thingy, and will
probably help bootloaders in general.
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

3c1e7165

28 1月, 2015 2 次提交

xen/grant-table: pre-populate kernel unmap ops for xen_gnttab_unmap_refs() · 853d0289

由 David Vrabel 提交于 1月 05, 2015

When unmapping grants, instead of converting the kernel map ops to
unmap ops on the fly, pre-populate the set of unmap ops.

This allows the grant unmap for the kernel mappings to be trivially
batched in the future.
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
Reviewed-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>

853d0289

ARM: digicolor: add low level debug support · e23814da

由 Baruch Siach 提交于 1月 14, 2015

Use the USART peripheral as UART for low level debug. Only the UA0 port is
currently supported.
Acked-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NBaruch Siach <baruch@tkos.co.il>
Signed-off-by: NOlof Johansson <olof@lixom.net>

e23814da

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功