Commits · 116c81f427ff6c5380850963e3fb8798cc821d2b · openanolis / cloud-kernel

09 Sep, 2016 31 commits

arm64: Work around systems with mismatched cache line sizes · 116c81f4

Authored by Suzuki K Poulose on Sep 09, 2016

Systems with differing CPU i-cache/d-cache line sizes can cause
problems with the cache management by software when the execution
is migrated from one to another. Usually, the application reads
the cache size on a CPU and then uses that length to perform cache
operations. However, if it gets migrated to another CPU with a smaller
cache line size, things could go completely wrong. To prevent such
cases, always use the smallest cache line size among the CPUs. The
kernel CPU feature infrastructure already keeps track of the safe
value for all CPUID registers including CTR. This patch works around
the problem by:

For the kernel, dynamically patch the kernel to read the cache size
from the system-wide copy of CTR_EL0.

For applications, trap read accesses to CTR_EL0 (by clearing SCTLR_EL1.UCT)
and emulate the mrs instruction to return the system-wide safe value
of CTR_EL0.

For faster access (i.e., avoiding a lookup of the system-wide value of
CTR_EL0 via read_system_reg), we keep track of the pointer to the table
entry for CTR_EL0 in the CPU feature infrastructure.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
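
The "smallest cache line size among the CPUs" rule from this commit message can be illustrated with a small userspace sketch. It is not the kernel's implementation: the helper names `ctr_line_bytes` and `safe_line_bytes` are hypothetical, and the per-CPU CTR_EL0 values are passed in as a plain array. The field positions (IminLine in bits [3:0], DminLine in bits [19:16], each holding log2 of the line size in 4-byte words) follow the architectural definition of CTR_EL0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CTR_EL0 field positions: IminLine is bits [3:0], DminLine is bits [19:16]. */
#define CTR_IMINLINE_SHIFT 0
#define CTR_DMINLINE_SHIFT 16
#define CTR_LINE_MASK      0xfu

/* Each field holds log2(number of 4-byte words), so bytes = 4 << field. */
static unsigned int ctr_line_bytes(uint32_t ctr, unsigned int shift)
{
    return 4u << ((ctr >> shift) & CTR_LINE_MASK);
}

/*
 * Hypothetical helper: smallest line size (in bytes) across the CTR_EL0
 * values sampled from every CPU. This models the "system wide safe value".
 */
static unsigned int safe_line_bytes(const uint32_t *ctr, size_t ncpus,
                                    unsigned int shift)
{
    unsigned int min;
    size_t i;

    assert(ncpus > 0);
    min = ctr_line_bytes(ctr[0], shift);
    for (i = 1; i < ncpus; i++) {
        unsigned int sz = ctr_line_bytes(ctr[i], shift);

        if (sz < min)
            min = sz;
    }
    return min;
}
```

Cache maintenance loops that step by the value returned here never skip a line on any CPU, which is the property the workaround relies on.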

arm64: Refactor sysinstr exception handling · 9dbd5bb2

Authored by Suzuki K Poulose on Sep 09, 2016

Right now we trap some of the user space data cache operations
based on a few errata (ARM 819472, 826319, 827319 and 824069).
We also need to trap userspace access to CTR_EL0 if we detect
mismatched cache line sizes. Since both of these traps share the same
exception class (EC), refactor the handler a little to make it more
readable.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm64: Introduce raw_{d,i}cache_line_size · 072f0a63

Authored by Suzuki K Poulose on Sep 09, 2016

On systems with mismatched i/d cache min line sizes, we need to use
the smallest size possible across all CPUs. This will be done by fetching
the system-wide safe value from the CPU feature infrastructure.
However, some special users (e.g. kexec, hibernate) need the line
size of the current CPU (rather than the system-wide value), either because
the system-wide feature may not be accessible or because the caller is
guaranteed to execute without being migrated.
Provide another helper which fetches the cache line size on the current CPU.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: James Morse <james.morse@arm.com>
Reviewed-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm64: alternative: Add support for patching adrp instructions · c831b2ae

Authored by Suzuki K Poulose on Sep 09, 2016

adrp computes an address from a PC-relative offset to the 4K page
containing a symbol. If it appears in alternative code that gets
patched in, the offset must be adjusted to reflect the address the
instruction will actually run from. This patch adds support for
fixing up the offset of adrp instructions.

Cc: Will Deacon <will.deacon@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
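
The offset adjustment described in this commit message comes down to a little page arithmetic. The sketch below is a standalone illustration, not the patch itself: `rebase_adrp_offset` and its parameter names are hypothetical. It takes the address where the alternative instruction was assembled (`alt_pc`), the address where it will finally execute (`run_pc`), and the byte offset currently encoded in the instruction.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_4K 0x1000ull

static uint64_t align_down_4k(uint64_t addr)
{
    return addr & ~(PAGE_4K - 1);
}

/*
 * Hypothetical helper: adrp resolves to page(PC) + offset. Work out the
 * target page as seen from alt_pc, then express it relative to the page
 * of run_pc so the moved instruction still points at the same page.
 */
static int64_t rebase_adrp_offset(uint64_t alt_pc, uint64_t run_pc,
                                  int64_t orig_offset)
{
    uint64_t target;

    /* adrp offsets are always a whole number of 4K pages. */
    assert((orig_offset & (int64_t)(PAGE_4K - 1)) == 0);

    target = align_down_4k(alt_pc) + (uint64_t)orig_offset;
    return (int64_t)(target - align_down_4k(run_pc));
}
```

Only the page of each address matters, so the low 12 bits of `alt_pc` and `run_pc` do not affect the result.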

arm64: insn: Add helpers for adrp offsets · 46084bc2

Authored by Suzuki K Poulose on Sep 09, 2016

Adds helpers for decoding/encoding the PC relative addresses for adrp.
This will be used for handling dynamic patching of 'adrp' instructions
in alternative code patching.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
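
To make the decoding and encoding concrete, here is a minimal sketch of what such helpers might look like. The function names `adrp_get_offset` and `adrp_set_offset` are hypothetical and are not taken from the patch. The bit layout is the A64 ADRP encoding: a 21-bit signed page count split into immlo (bits [30:29]) and immhi (bits [23:5]).

```c
#include <assert.h>
#include <stdint.h>

#define ADRP_IMMLO_SHIFT 29
#define ADRP_IMMLO_MASK  0x3u
#define ADRP_IMMHI_SHIFT 5
#define ADRP_IMMHI_MASK  0x7ffffu

/* Decode the signed byte offset (a multiple of 4K) from an adrp. */
static int64_t adrp_get_offset(uint32_t insn)
{
    uint32_t immlo = (insn >> ADRP_IMMLO_SHIFT) & ADRP_IMMLO_MASK;
    uint32_t immhi = (insn >> ADRP_IMMHI_SHIFT) & ADRP_IMMHI_MASK;
    int64_t pages = (int64_t)((immhi << 2) | immlo);

    if (pages & (1 << 20))      /* sign-extend the 21-bit page count */
        pages -= (1 << 21);
    return pages * 4096;
}

/* Re-encode an adrp with a new byte offset, keeping opcode and Rd. */
static uint32_t adrp_set_offset(uint32_t insn, int64_t offset)
{
    int64_t pages_signed = offset / 4096;
    uint32_t pages;

    assert(offset % 4096 == 0);
    assert(pages_signed >= -(1 << 20) && pages_signed < (1 << 20));

    pages = (uint32_t)pages_signed & 0x1fffffu;
    insn &= ~((ADRP_IMMLO_MASK << ADRP_IMMLO_SHIFT) |
              (ADRP_IMMHI_MASK << ADRP_IMMHI_SHIFT));
    insn |= (pages & ADRP_IMMLO_MASK) << ADRP_IMMLO_SHIFT;
    insn |= ((pages >> 2) & ADRP_IMMHI_MASK) << ADRP_IMMHI_SHIFT;
    return insn;
}
```

A round trip through both helpers should leave the destination register and opcode bits unchanged, which is what makes them safe to use on an instruction that is being relocated.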

arm64: alternative: Disallow patching instructions using literals · baa763b5

Authored by Suzuki K Poulose on Sep 09, 2016

The alternative code patching doesn't check if the replaced instruction
uses a pc relative literal. This could cause silent corruption in the
instruction stream as the instruction will be executed from a different
address than what it was compiled for. Catch all such cases.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Suggested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
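
As a rough illustration of the kind of check involved, the sketch below scans a replacement sequence for instructions in the A64 "load register (literal)" class, which read data at a PC-relative address. It deliberately covers only that one class; the actual patch may check other PC-relative instructions too, and the names `is_load_literal` and `first_literal_load` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A64 load register (literal): bits [29:27] = 011, bits [25:24] = 00. */
#define A64_LDR_LIT_MASK 0x3b000000u
#define A64_LDR_LIT_VAL  0x18000000u

static bool is_load_literal(uint32_t insn)
{
    return (insn & A64_LDR_LIT_MASK) == A64_LDR_LIT_VAL;
}

/*
 * Hypothetical check: return the index of the first literal load in a
 * replacement sequence, or -1 if the sequence is safe to relocate as far
 * as this one instruction class is concerned.
 */
static long first_literal_load(const uint32_t *insns, size_t n)
{
    size_t i;

    assert(insns != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (is_load_literal(insns[i]))
            return (long)i;
    return -1;
}
```

A caller could refuse to patch (or warn loudly) whenever the result is not -1, instead of letting the literal be fetched from the wrong address.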

arm64: Rearrange CPU errata workaround checks · c47a1900

Authored by Suzuki K Poulose on Sep 09, 2016

Right now we run through the work around checks on a CPU
from __cpuinfo_store_cpu. There are some problems with that:

1) We initialise the system wide CPU feature registers only after the
Boot CPU updates its cpuinfo. Now, if a work around depends on the
variance of a CPU ID feature (e.g., a check for cache line size mismatch),
we have no way of performing it cleanly for the boot CPU.

2) It is out of place, invoked from __cpuinfo_store_cpu() in cpuinfo.c. It
is not an obvious place for that.

This patch rearranges the CPU-specific capability (aka work around) checks.

1) At the moment we use verify_local_cpu_capabilities() to check if a new
CPU has all the system advertised features. Use this for the secondary CPUs
to perform the work around check. For that we rename
  verify_local_cpu_capabilities() => check_local_cpu_capabilities()
which:

   If the system wide capabilities haven't been initialised (i.e., the CPU
   is activated at boot), update the system wide detected work arounds.

   Otherwise (i.e., a CPU hotplugged in later), verify that this CPU conforms to the
   system wide capabilities.

2) Boot CPU updates the work arounds from smp_prepare_boot_cpu() after we have
initialised the system wide CPU feature values.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
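
The two-way behaviour described for check_local_cpu_capabilities() can be modelled in a few lines. This is a simplified userspace model under stated assumptions, not the kernel code: `struct sys_caps`, the bitmask representation and the function name `check_local_cpu_caps` are all hypothetical, and "conforms" is taken to mean that a late CPU needs no work around the system has not already enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the system-wide state. */
struct sys_caps {
    bool initialised;           /* set once boot-time detection is final */
    unsigned long workarounds;  /* one bit per detected work around */
};

/*
 * Before the system-wide state is finalised, a CPU contributes the work
 * arounds it needs. Afterwards, a late (hotplugged) CPU is only verified:
 * it is rejected if it needs something that was not enabled at boot.
 */
static bool check_local_cpu_caps(struct sys_caps *sys, unsigned long cpu_needs)
{
    assert(sys != NULL);

    if (!sys->initialised) {
        sys->workarounds |= cpu_needs;
        return true;
    }
    return (cpu_needs & ~sys->workarounds) == 0;
}
```

The key design point is that the verify path never modifies the system-wide state, so a late CPU cannot change what was already patched into the running kernel.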

arm64: Use consistent naming for errata handling · 89ba2645

Authored by Suzuki K Poulose on Sep 09, 2016

This is a cosmetic change to rename the functions dealing with
the errata work arounds to be more consistent with their naming.

1) check_local_cpu_errata() => update_cpu_errata_workarounds()
check_local_cpu_errata() actually updates the system's errata work
arounds. So rename it to reflect the same.

2) verify_local_cpu_errata() => verify_local_cpu_errata_workarounds()
Use errata_workarounds instead of _errata.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm64: Set the safe value for L1 icache policy · ee7bc638

Authored by Suzuki K Poulose on Sep 09, 2016

Right now we use 0 as the safe value for CTR_EL0:L1Ip, which is
not defined at the moment. The safer value for the L1Ip should be
the weakest of the policies, which happens to be AIVIVT. While at it,
fix the comment about safe_val.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
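
A small sketch may help show what a "safe value" for L1Ip means in practice. The field position (CTR_EL0 bits [15:14]) and the encodings for AIVIVT, VIPT and PIPT are architectural. The fallback rule used here (keep the policy if every CPU agrees, otherwise report AIVIVT) is an assumption made for illustration, and `safe_l1ip` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CTR_L1IP_SHIFT 14
#define CTR_L1IP_MASK  0x3u

/* L1Ip encodings; the value 0 was not defined when this patch was written. */
enum l1ip_policy {
    L1IP_AIVIVT = 1,
    L1IP_VIPT   = 2,
    L1IP_PIPT   = 3,
};

static unsigned int ctr_l1ip(uint32_t ctr)
{
    return (ctr >> CTR_L1IP_SHIFT) & CTR_L1IP_MASK;
}

/*
 * Assumed rule for illustration: if all CPUs report the same policy, use
 * it; on any mismatch fall back to the weakest policy, AIVIVT.
 */
static unsigned int safe_l1ip(const uint32_t *ctr, size_t ncpus)
{
    unsigned int policy;
    size_t i;

    assert(ncpus > 0);
    policy = ctr_l1ip(ctr[0]);
    for (i = 1; i < ncpus; i++)
        if (ctr_l1ip(ctr[i]) != policy)
            return L1IP_AIVIVT;
    return policy;
}
```

Falling back to the weakest policy means software assuming the reported value performs at least as much maintenance as any CPU in the system requires.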

arm64/numa: remove the limitation that cpu0 must bind to node0 · 7ba5f605

Authored by Zhen Lei on Sep 01, 2016

1. Remove the old binding code.
2. Read the nid of cpu0 from dts.
3. Fall back to nid 0 for cpu0 when numa=off is set in bootargs.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm64/numa: remove some useless code · df7ffa34

Authored by Zhen Lei on Sep 01, 2016

When the deleted code is executed, only the bit for cpu0 is set in
cpu_possible_mask, so only set_cpu_numa_node(0, NUMA_NO_NODE) is
executed. map_cpu_to_node(0, 0) is then called soon afterwards, so this
code can be safely removed.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm64/numa: support HAVE_SETUP_PER_CPU_AREA · 7af3a0a9

Authored by Zhen Lei on Sep 01, 2016

Allocate each percpu area from its local NUMA node. Without this
patch, all percpu areas are allocated from the node that cpu0 belongs
to.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm64: numa: Use pr_fmt() · f11c7bac

Authored by Kefeng Wang on Sep 01, 2016

Use pr_fmt to prefix kernel output, and remove the duplicated
"NUMA turned off" message.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

of_numa: Use pr_fmt() · ad021805

Authored by Kefeng Wang on Sep 01, 2016

Use pr_fmt to prefix kernel output.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

of_numa: Use of_get_next_parent to simplify code · 837dae1b

Authored by Kefeng Wang on Sep 01, 2016

Use of_get_next_parent() instead of open-coding it.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm64/numa: avoid inconsistent information to be printed · 794224ea

Authored by Zhen Lei on Sep 01, 2016

numa_init() may return an error because of a NUMA configuration error, in
which case "No NUMA configuration found" is inaccurate. Instead, the
specific configuration error should be printed right away by the branch
that detects it.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

of/numa: remove a duplicated warning · 9787ed6e

Authored by Zhen Lei on Sep 01, 2016

This warning has been printed in of_numa_parse_cpu_nodes before.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

of/numa: add nid check for memory block · 571a588f

Authored by Zhen Lei on Sep 01, 2016

If the numa-id configured in a memory@ devicetree node is greater
than MAX_NUMNODES, we should report a warning. We already do this for the
cpus and distance-map dt nodes; this patch makes the memory nodes
consistent with them.

Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

of/numa: fix a memory@ node can only contains one memory block · 84b14256

Authored by Zhen Lei on Sep 01, 2016

For a normal memory@ devicetree node, the reg property can contain
multiple memory blocks.

Because we don't know how many memory blocks it contains, we start
from index 0 and increment by 1 until an error is returned (the end).

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
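
The "start at index 0 and stop at the first error" loop from this commit message can be sketched in plain C. This is a self-contained model rather than the devicetree code: `struct mem_block`, `get_mem_block` and `add_all_blocks` are hypothetical names, with `get_mem_block` standing in for whatever indexed lookup the real code performs on the node's reg property.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mem_block {
    uint64_t base;
    uint64_t size;
};

/*
 * Hypothetical stand-in for an indexed lookup of a node's reg entries:
 * returns 0 and fills *out on success, -1 once idx is past the last entry.
 */
static int get_mem_block(const struct mem_block *reg, size_t nreg,
                         size_t idx, struct mem_block *out)
{
    if (idx >= nreg)
        return -1;
    *out = reg[idx];
    return 0;
}

/* Walk blocks from index 0 until the lookup fails; return how many we saw. */
static size_t add_all_blocks(const struct mem_block *reg, size_t nreg,
                             uint64_t *total)
{
    struct mem_block blk;
    size_t i;

    assert(total != NULL);
    *total = 0;
    for (i = 0; get_mem_block(reg, nreg, i, &blk) == 0; i++)
        *total += blk.size;     /* a real caller would register the block */
    return i;
}
```

Because the loop condition is the lookup itself, the caller never needs to know the number of blocks in advance, which is the point the commit message makes.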

of/numa: remove a duplicated pr_debug information · 16a82f06

Authored by Zhen Lei on Sep 01, 2016

This information will be printed in the subfunction numa_add_memblk.
They are not the same, but very similar.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

drivers/perf: arm_pmu: expose a cpumask in sysfs · 48538b58

Authored by Mark Rutland on Sep 09, 2016

In systems with heterogeneous CPUs, there are multiple logical CPU PMUs,
each of which covers a subset of CPUs in the system. In some cases
userspace needs to know which CPUs a given logical PMU covers, so we'd
like to expose a cpumask under sysfs, similar to what is done for uncore
PMUs.

Unfortunately, prior to commit 00e727bb ("perf stat: Balance
opening and reading events"), perf stat only correctly handled a cpumask
holding a single CPU, and only when profiling in system-wide mode. In
other cases, the presence of a cpumask file could cause perf stat to
behave erratically.

Thus, exposing a cpumask file would break older perf binaries in cases
where they would otherwise work.

To avoid this issue while still providing userspace with the information
it needs, this patch exposes a differently-named file (cpus) under
sysfs. New tools can look for this and operate correctly, while older
tools will not be adversely affected by its presence.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

drivers/perf: arm_pmu: only use common attr_groups · 1589680d

Authored by Mark Rutland on Sep 09, 2016

Now that the 32-bit and 64-bit perf backends use the common groups
directly, remove the fallback and no longer allow the groups array to be
overridden.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm: perf: move to common attr_group fields · 9268c5da

Authored by Mark Rutland on Sep 09, 2016

By using a common attr_groups array, the common arm_pmu code can set up
common files (e.g. cpumask) for us in subsequent patches.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm64: perf: move to common attr_group fields · 569de902

Authored by Mark Rutland on Sep 09, 2016

By using a common attr_groups array, the common arm_pmu code can set up
common files (e.g. cpumask) for us in subsequent patches.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

drivers/perf: arm_pmu: add common attr group fields · 86cdd72a

Authored by Mark Rutland on Sep 09, 2016

In preparation for adding common attribute groups, add an array of
attribute group pointers to arm_pmu, which will be used if the
backend hasn't already set pmu::attr_groups.

Subsequent patches will move backends over to using these, before adding
common fields.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm64: simplify contextidr_thread_switch · d3ea42aa