- 28 March 2018, 3 commits
-
-
Committed by Dave Martin
When hardened usercopy support was added for arm64, it was concluded that all cases of usercopy into and out of thread_struct were statically sized and so didn't require explicit whitelisting of the appropriate fields in thread_struct. Testing with usercopy hardening enabled has revealed that this is not the case for certain ptrace regset manipulation calls on arm64. This occurs because the sizes of usercopies associated with the regset API are dynamic by construction, and because arm64 does not always stage such copies via the stack: indeed the regset API is designed to avoid the need for that by adding some bounds checking. This is currently believed to affect only the fpsimd and TLS registers. Because the whitelisted fields in thread_struct must be contiguous, this patch groups them together in a nested struct. It is also necessary to be able to determine the location and size of that struct, so rather than making the struct anonymous (which would save on edits elsewhere) or adding an anonymous union containing named and unnamed instances of the same struct (gross), this patch gives the struct a name and makes the necessary edits to code that references it (noisy but simple). Care is needed to ensure that the new struct does not contain padding (which the usercopy hardening would fail to protect). For this reason, the presence of tp2_value is made unconditional, since a padding field would be needed there in any case. This pads up to the 16-byte alignment required by struct user_fpsimd_state. Acked-by: Kees Cook <keescook@chromium.org> Reported-by: Mark Rutland <mark.rutland@arm.com> Fixes: 9e8084d3 ("arm64: Implement thread_struct whitelist for hardened usercopy") Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
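A simplified sketch of the shape this takes (field and struct names follow the commit; the authoritative definition lives in arch/arm64/include/asm/processor.h and differs in detail): the user-accessible fields are gathered into one contiguous, padding-free nested struct, and the whitelist helper reports that struct's offset and size.

  /* Sketch only, not the full upstream definition. */
  struct thread_struct {
  	struct cpu_context	cpu_context;	/* kernel context, not user-accessible */

  	/* Whitelisted for hardened usercopy: must contain no padding. */
  	struct {
  		unsigned long		tp_value;	/* TLS register */
  		unsigned long		tp2_value;	/* kept unconditional to avoid padding */
  		struct user_fpsimd_state fpsimd_state;
  	} uw;

  	/* ... remaining, non-whitelisted fields ... */
  };

  static inline void arch_thread_struct_whitelist(unsigned long *offset,
  						unsigned long *size)
  {
  	/* Report the nested struct as the only user-copyable region. */
  	*offset = offsetof(struct thread_struct, uw);
  	*size = sizeof(((struct thread_struct *)0)->uw);
  }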
-
Committed by Dave Martin
In preparation for using a common representation of the FPSIMD state for tasks and KVM vcpus, this patch separates out the "cpu" field that is used to track the cpu on which the state was most recently loaded. This will allow common code to operate on task and vcpu contexts without requiring the cpu field to be stored at the same offset from the FPSIMD register data in both cases. This should avoid the need for messing with the definition of those parts of struct vcpu_arch that are exposed in the KVM user ABI. The resulting change is also convenient for grouping and defining the set of thread_struct fields that are supposed to be accessible to copy_{to,from}_user(), which includes user_fpsimd_state but should exclude the cpu field. This patch does not amend the usercopy whitelist to match: that will be addressed in a subsequent patch. Signed-off-by: Dave Martin <Dave.Martin@arm.com> [will: inline fpsimd_flush_state for now] Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Philip Elcan
Several of the bits of the TLBI register operand are RES0 per the ARM ARM, so TLBI operations should avoid writing non-zero values to these bits. This patch adds a macro __TLBI_VADDR(addr, asid) that creates the operand register in the correct format and honors the RES0 bits. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Philip Elcan <pelcan@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
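For reference, a sketch of the helper consistent with the upstream macro in arch/arm64/include/asm/tlbflush.h: the virtual address bits VA[55:12] go into operand bits [43:0], the ASID into bits [63:48], and the RES0 bits [47:44] are left clear.

  #define __TLBI_VADDR(addr, asid)				\
  	({							\
  		unsigned long __ta = (addr) >> 12;		\
  		__ta &= GENMASK_ULL(43, 0);			\
  		__ta |= (unsigned long)(asid) << 48;		\
  		__ta;						\
  	})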
-
- 27 March 2018, 20 commits
-
-
Committed by Will Deacon
We need linux/compiler.h for unreachable(), so #include it here. Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Will Deacon
We want to avoid pulling linux/preempt.h into cmpxchg.h, since that can introduce a circular dependency on linux/bitops.h. linux/preempt.h is only needed by the per-cpu cmpxchg implementation, which is better off alongside the per-cpu xchg implementation in percpu.h, so move it there and add the missing #include. Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Will Deacon
Having asm/cmpxchg.h pull in linux/bug.h is problematic because this ends up pulling in the atomic bitops, which themselves may be built on top of atomic.h and cmpxchg.h. Instead, just include build_bug.h for the definition of BUILD_BUG. Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Will Deacon
When the LL/SC atomics are moved out-of-line, they are annotated as notrace and exported to modules. Ensure we pull in the relevant include files so that these macros are defined when we need them. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Will Deacon
fpsimd.h uses the __init annotation, so pull in linux/init.h. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Will Deacon
This reverts commit 1f85b42a. The internal dma-direct.h API has changed in -next, which collides with us trying to use it to manage non-coherent DMA devices on systems with unreasonably large cache writeback granules. This isn't at all trivial to resolve, so revert our changes for now and we can revisit this after the merge window. Effectively, this just restores our behaviour back to that of 4.16. Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Suzuki K Poulose
We enable the hardware DBM bit on a capable CPU very early in the boot via __cpu_setup. This doesn't give us the flexibility of optionally disabling the feature, since clearing the bit is somewhat costly as the TLB can cache the settings. Instead, delay enabling the feature until the CPU is brought up into the kernel, using the feature capability mechanism to handle it. Hardware DBM is a non-conflicting feature, i.e. the kernel can safely run with a mix of CPUs, some using the feature and others not. So it is safe for a late CPU to have this capability and enable it, even if the active CPUs don't. To get this handled properly by the infrastructure, we unconditionally set the capability and only enable it on CPUs which really have the feature. Also, we print the feature detection from the "matches" callback to make sure we don't mislead the user when none of the CPUs could use the feature. Cc: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Dave Martin <dave.martin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Suzuki K Poulose
Update the MIDR encodings for the Cortex-A55 and Cortex-A35. Cc: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Dave Martin <dave.martin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
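For reference, a sketch in the style of arch/arm64/include/asm/cputype.h; the part numbers (0xD04 for Cortex-A35, 0xD05 for Cortex-A55) are taken from Arm's public documentation rather than from the commit text, so treat them as an assumption.

  #define ARM_CPU_PART_CORTEX_A35		0xD04
  #define ARM_CPU_PART_CORTEX_A55		0xD05

  #define MIDR_CORTEX_A35	MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A35)
  #define MIDR_CORTEX_A55	MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A55)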
-
Committed by Suzuki K Poulose
Some capabilities have different criteria for detection and associated actions based on the matching criteria, even though they all share the same capability bit. So far we have used multiple entries with the same capability bit to handle this. This is prone to errors, as cpu_enable is invoked for each entry, irrespective of whether the detection rule applies to the CPU or not. It also complicates other helpers, e.g. __this_cpu_has_cap. This patch adds a wrapper entry to cover all the possible variations of a capability by maintaining a list of matches + cpu_enable callbacks. To avoid complicating the prototypes for "matches()", we use arm64_cpu_capabilities to maintain the list and ignore all the other fields except matches & cpu_enable. This ensures that: 1) the capability is set when at least one of the entries detects it, and 2) action is only taken for the entries that match, avoiding explicit checks in cpu_enable() before taking some action. The only constraint here is that all the entries should have the same "type" (i.e. scope and conflict rules). If a cpu_enable() method is associated with multiple matches for a single capability, care should be taken that either the match criteria are mutually exclusive, or that the method is robust against being called multiple times. This also reverts the changes introduced by commit 67948af4 ("arm64: capabilities: Handle duplicate entries for a capability"). Cc: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Dave Martin <dave.martin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Suzuki K Poulose
Add helpers for detecting an erratum on a list of MIDR ranges of affected CPUs which share the same workaround. Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Dave Martin <dave.martin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Suzuki K Poulose
Add helpers for checking whether a given CPU MIDR falls in a range of variants/revisions for a given model. Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Dave Martin <dave.martin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
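A self-contained sketch of the idea (the MIDR field layout follows the ARM ARM; the struct and helper names are illustrative and the upstream versions in asm/cputype.h differ in detail): treat the (variant, revision) fields of MIDR_EL1 as a single ordered value and check it against an inclusive range for a given model.

  #include <linux/types.h>

  struct midr_range {
  	u32 model;	/* implementer + part number, variant/revision masked out */
  	u32 rv_min;	/* minimum (variant << 4 | revision), inclusive */
  	u32 rv_max;	/* maximum (variant << 4 | revision), inclusive */
  };

  static inline u32 midr_model(u32 midr)
  {
  	/* Mask out the variant [23:20] and revision [3:0] fields. */
  	return midr & ~((0xfU << 20) | 0xfU);
  }

  static inline u32 midr_variant_rev(u32 midr)
  {
  	return (((midr >> 20) & 0xf) << 4) | (midr & 0xf);
  }

  static inline bool is_midr_in_range(u32 midr, const struct midr_range *range)
  {
  	return midr_model(midr) == range->model &&
  	       midr_variant_rev(midr) >= range->rv_min &&
  	       midr_variant_rev(midr) <= range->rv_max;
  }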
-
Committed by Suzuki K Poulose
We expect all CPUs to be running at the same EL inside the kernel, with or without VHE enabled, and we have strict checks to ensure that any mismatch triggers a kernel panic. If VHE is enabled, we use the feature based on the boot CPU and all other CPUs should follow. This makes it a perfect candidate for a capability based on the boot CPU, which should be matched by all the CPUs (both when VHE is on and when it is off). This saves us some not-so-pretty hooks and special code, just for verifying the conflict. The patch also makes the VHE capability entry depend on CONFIG_ARM64_VHE. Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Will Deacon <will.deacon@arm.com> Reviewed-by: Dave Martin <dave.martin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Suzuki K Poulose
The kernel detects and uses some of the features based on the boot CPU and expects that all the following CPUs conform to it. For example, with VHE and the boot CPU running at EL2, the kernel decides to keep itself running at EL2. If another CPU is brought up without this capability, we use custom hooks (via check_early_cpu_features()) to handle it. To handle such capabilities, add support for detecting and enabling capabilities based on the boot CPU. A bit is added to indicate whether the capability should be detected early on the boot CPU. The infrastructure then ensures that such capabilities are probed and "enabled" early on the boot CPU and then enabled on the subsequent CPUs. Cc: Julien Thierry <julien.thierry@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Dave Martin <dave.martin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Suzuki K Poulose
KPTI is treated as a system-wide feature and is only detected if all the CPUs in the system need the defense, unless it is forced via the kernel command line. This leaves a system with a mix of CPUs with and without the defense vulnerable. Also, if a late CPU needs KPTI but KPTI was not activated at boot time, the CPU is currently allowed to boot, which is a potential security vulnerability. This patch ensures that KPTI is turned on if at least one CPU detects the capability (i.e. change scope to SCOPE_LOCAL_CPU), and rejects a late CPU that requires the defense when the system hasn't enabled it. Cc: Will Deacon <will.deacon@arm.com> Reviewed-by: Dave Martin <dave.martin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Suzuki K Poulose
Now that we have the flexibility of defining system features based on individual CPUs, introduce a CPU feature type that can be detected on a local scope and that ignores conflicts on late CPUs. This is applicable to ARM64_HAS_NO_HW_PREFETCH, where it is fine for the system to have CPUs without hardware prefetch turning up later: we only suffer a performance penalty, nothing fatal. Cc: Will Deacon <will.deacon@arm.com> Reviewed-by: Dave Martin <dave.martin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Suzuki K Poulose
While processing the list of capabilities, it is useful to filter out some of the entries based on a given mask of capability scopes, to allow better control. This can be used later for handling LOCAL vs SYSTEM-wide capabilities and more. All capabilities should have their scope set to either LOCAL_CPU or SYSTEM. No functional/flow change. Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Dave Martin <dave.martin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Suzuki K Poulose
When a CPU is brought up, it is checked against the caps that are known to be enabled on the system (via verify_local_cpu_capabilities()). Based on the state of the capability on the CPU vs. that of the system, we could have the following combinations of conflict:

  x-----------------------------x
  |  Type  | System | Late CPU |
  |-----------------------------|
  |   a    |   y    |    n     |
  |-----------------------------|
  |   b    |   n    |    y     |
  x-----------------------------x

Case (a) is not permitted for caps which are system features that the system expects all CPUs to have (e.g. VHE), while (a) is ignored for all errata workarounds. However, there could be exceptions to the plain filtering approach; e.g. KPTI is an optional feature for a late CPU as long as the system already enables it. Case (b) is not permitted for errata workarounds that cannot be activated after the kernel has finished booting, and we ignore (b) for features. Here, yet again, KPTI is an exception: if a late CPU needs KPTI we are too late to enable it (because we change the allocation of ASIDs etc.). Add two different flags to indicate how the conflict should be handled: ARM64_CPUCAP_PERMITTED_FOR_LATE_CPU - late CPUs may have the capability (i.e. case (b) is tolerated); ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU - late CPUs may lack the capability (i.e. case (a) is tolerated). Now that we have the flags to describe how the errata and the features are treated, define types for ERRATUM and FEATURE. Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Dave Martin <dave.martin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
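An illustrative sketch (not the exact upstream code) of how the two flags map onto the conflict table above; the cap->type field name is assumed from this series, and the helper is hypothetical.

  /* Assumes <asm/cpufeature.h> for struct arm64_cpu_capabilities. */
  static bool late_cpu_conflict_allowed(const struct arm64_cpu_capabilities *cap,
  				      bool cpu_has_cap, bool system_has_cap)
  {
  	if (system_has_cap && !cpu_has_cap)	/* case (a): system has it, late CPU doesn't */
  		return cap->type & ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU;
  	if (!system_has_cap && cpu_has_cap)	/* case (b): late CPU has it, system doesn't */
  		return cap->type & ARM64_CPUCAP_PERMITTED_FOR_LATE_CPU;
  	return true;				/* no conflict */
  }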
-
Committed by Suzuki K Poulose
We use arm64_cpu_capabilities to represent CPU ELF HWCAPs exposed to userspace and the CPU hwcaps used by the kernel, which include CPU features and CPU errata workarounds. Capabilities have some properties that decide how they should be treated:

1) Detection, i.e. scope: a cap could be "detected" either:
   - if it is present on at least one CPU (SCOPE_LOCAL_CPU), or
   - if it is present on all the CPUs (SCOPE_SYSTEM)

2) When is it enabled? A cap is treated as "enabled" when the system takes some action based on whether the capability is detected or not, e.g. setting some control register or patching the kernel code. Right now, we treat all caps as enabled at boot time, after all the CPUs are brought up by the kernel. But there are certain caps which are enabled early during boot (e.g. VHE, GIC_CPUIF for NMI) and which the kernel starts using even before the secondary CPUs are brought up. We would need a way to describe this for each capability.

3) Conflict on a late CPU: when a CPU is brought up, it is checked against the caps that are known to be enabled on the system (via verify_local_cpu_capabilities()). Based on the state of the capability on the CPU vs. that of the system, we could have the following combinations of conflict:

  x-----------------------------x
  |  Type  | System | Late CPU |
  |-----------------------------|
  |   a    |   y    |    n     |
  |-----------------------------|
  |   b    |   n    |    y     |
  x-----------------------------x

Case (a) is not permitted for caps which are system features that the system expects all CPUs to have (e.g. VHE), while (a) is ignored for all errata workarounds. However, there could be exceptions to the plain filtering approach; e.g. KPTI is an optional feature for a late CPU as long as the system already enables it. Case (b) is not permitted for errata workarounds that require some action which cannot be delayed, and we ignore (b) for features. Here, yet again, KPTI is an exception: if a late CPU needs KPTI we are too late to enable it (because we change the allocation of ASIDs etc.).

So this calls for much more fine-grained behavior for each capability. And if we define all the attributes to control their behavior properly, we may be able to use a single table for the CPU hwcaps (which cover errata and features, but not the ELF HWCAPs). This is a preparatory step to get there: more bits will be added for the properties listed above. We are going to use a bit-mask to encode all the properties of a capability. This patch encodes the "SCOPE" of the capability. As such, there is no change in how the capabilities are treated. Cc: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Dave Martin <dave.martin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Suzuki K Poulose
We have errata workaround processing code in cpu_errata.c, which calls back into helpers defined in cpufeature.c. Now that we are going to make the handling of capabilities generic by adding the information to each capability, move the errata-workaround-specific processing code. No functional changes. Cc: Will Deacon <will.deacon@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Dave Martin <dave.martin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Dave Martin
We issue the enable() callback for all CPU hwcaps capabilities available on the system, on all the CPUs. So far we have ignored the argument passed to the callback, which had a prototype accepting a "void *" for use with on_each_cpu() and later with stop_machine(). However, with commit 0a0d111d ("arm64: cpufeature: Pass capability structure to ->enable callback"), there are some users of the argument who want the matching capability struct pointer where there are multiple matching criteria for a single capability. Clean up the declaration of the callback to make this clear:
1) Rename it to cpu_enable(), to imply taking the necessary actions on the calling CPU for the entry.
2) Pass a const pointer to the capability, to allow the callback to inspect the entry (e.g. to check whether any action is needed on the CPU).
3) We don't care about the result of the callback, so turn the return type into void.
Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Andre Przywara <andre.przywara@arm.com> Cc: James Morse <james.morse@arm.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Dave Martin <dave.martin@arm.com> [suzuki: convert more users, rename call back and drop results] Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
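A sketch of the reworked callback shape described above; the example implementation and config_something_on_this_cpu() are hypothetical, shown only to illustrate the calling convention.

  /* Member of struct arm64_cpu_capabilities after the rework: */
  void (*cpu_enable)(const struct arm64_cpu_capabilities *cap);

  /* Hypothetical callback, invoked on the CPU being brought up: */
  static void cpu_enable_example(const struct arm64_cpu_capabilities *cap)
  {
  	/* The matched entry can be inspected to decide whether action is needed. */
  	if (this_cpu_has_cap(cap->capability))
  		config_something_on_this_cpu();	/* placeholder action */
  }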
-
- 20 March 2018, 3 commits
-
-
Committed by Dave Martin
Currently a SIGFPE delivered in response to a floating-point exception trap may have si_code set to 0 on arm64. As reported by Eric, this is a bad idea since this is the value of SI_USER -- yet this signal is definitely not the result of kill(2), tgkill(2) etc., and si_uid and si_pid make limited sense, whereas we do want to yield a value for si_addr (which doesn't exist for SI_USER). It's not entirely clear whether the architecture permits a "spurious" fp exception trap where none of the exception flag bits in ESR_ELx is set. (IMHO the architectural intent is to forbid this.) However, it does permit those bits to contain garbage if the TFV bit in ESR_ELx is 0. That case isn't currently handled at all and may result in si_code == 0 or si_code containing a FPE_FLT* constant corresponding to an exception that did not in fact happen. There is nothing sensible we can return for si_code in such cases, but SI_USER is certainly not appropriate and will lead to violation of legitimate userspace assumptions. This patch allocates a new si_code value FPE_UNKNOWN that at least does not conflict with any existing SI_* or FPE_* code, and yields this in si_code for undiagnosable cases. This is probably the best simplicity/incorrectness tradeoff achievable without relying on implementation-dependent features or adding a lot of code. In any case, there appears to be no perfect solution possible that would justify a lot of effort here. Yielding FPE_UNKNOWN when some well-defined fp exception caused the trap is a violation of POSIX, but this is forced by the architecture. We have no realistic prospect of yielding the correct code in such cases. At present I am not aware of any ARMv8 implementation that supports trapped floating-point exceptions in any case. The new code may be applicable to other architectures for similar reasons. No attempt is made to provide ESR_ELx to userspace in the signal frame, since architectural limitations mean that it is unlikely to provide much diagnostic value, doesn't benefit existing software and would create ABI with no proven purpose. The existing mechanism for passing it also has problems of its own which may result in the wrong value being passed to userspace due to interaction with mm faults. The implied rework does not appear justified. Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Reported-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Suzuki K Poulose
Expose the new features introduced by the Arm v8.4 extensions to the Arm v8-A profile. These include:
1) Data independent timing of instructions (DIT, exposed as HWCAP_DIT)
2) Unaligned atomic instructions and single-copy atomicity of loads and stores (AT, exposed as HWCAP_USCAT)
3) LDAPR and STLR instructions with immediate offsets (extension to LRCPC, exposed as HWCAP_ILRCPC)
4) Flag manipulation instructions (TS, exposed as HWCAP_FLAGM)
Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Dave Martin <dave.martin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Ard Biesheuvel
Now that we have started keeping modules within 4 GB of the core kernel in all cases, we no longer need to special-case the adr_l/ldr_l/str_l macros for modules to deal with them being loaded farther away. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
- 09 March 2018, 4 commits
-
-
Committed by Shanker Donthineni
The D-cache clean and I-cache invalidation requirements for instructions to be coherent with data are discoverable through new fields in CTR_EL0. The following two control bits, DIC and IDC, were defined for this purpose. No point-of-unification cache maintenance operations need to be performed from software on systems where the CPU caches are transparent. This patch optimizes the three functions __flush_cache_user_range(), clean_dcache_area_pou() and invalidate_icache_range() when the hardware reports CTR_EL0.DIC and/or CTR_EL0.IDC: it skips the two instructions 'DC CVAU' and 'IC IVAU', and the associated loop logic, in order to avoid the unnecessary overhead.

CTR_EL0.DIC: instruction cache invalidation requirements for instruction to data coherence. The meaning of bit[29]:
  0: Instruction cache invalidation to the point of unification is required for instruction to data coherence.
  1: Instruction cache invalidation to the point of unification is not required for instruction to data coherence.

CTR_EL0.IDC: data cache clean requirements for instruction to data coherence. The meaning of bit[28]:
  0: Data cache clean to the point of unification is required for instruction to data coherence, unless CLIDR_EL1.LoC == 0b000 or (CLIDR_EL1.LoUIS == 0b000 && CLIDR_EL1.LoUU == 0b000).
  1: Data cache clean to the point of unification is not required for instruction to data coherence.

Co-authored-by: Philip Elcan <pelcan@codeaurora.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
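A sketch of how the two fields can be consulted; the bit positions come from the description above (IDC is bit 28, DIC is bit 29), and the two helper functions are illustrative rather than the upstream code.

  #include <linux/bits.h>

  #define CTR_IDC_SHIFT		28
  #define CTR_DIC_SHIFT		29

  /* When IDC is set, the D-cache clean to PoU can be skipped. */
  static inline bool dcache_clean_pou_needed(u64 ctr_el0)
  {
  	return !(ctr_el0 & BIT(CTR_IDC_SHIFT));
  }

  /* When DIC is set, the I-cache invalidation to PoU can be skipped. */
  static inline bool icache_inval_pou_needed(u64 ctr_el0)
  {
  	return !(ctr_el0 & BIT(CTR_DIC_SHIFT));
  }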
-
Committed by Ard Biesheuvel
Omit patching of the ADRP instruction at module load time if the current CPUs are not susceptible to the erratum. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [will: Drop duplicate initialisation of .def_scope field] Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Ard Biesheuvel
In some cases, core variants that are affected by a certain erratum also exist in versions that have the erratum fixed, and this fact is recorded in a dedicated bit in the system register REVIDR_EL1. Since the architecture does not require that a certain bit retains its meaning across different variants of the same model, each such REVIDR bit is tightly coupled to a certain revision/variant value, and so we need a list of revidr_mask/midr pairs to carry this information. So add the struct member and the associated macros and handling to allow REVIDR fixes to be taken into account. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Ard Biesheuvel
Working around Cortex-A53 erratum #843419 involves special handling of ADRP instructions that end up in the last two instruction slots of a 4k page, or whose output register gets overwritten without having been read. (Note that the latter instruction sequence is never emitted by a properly functioning compiler, which is why it is disregarded by the handling of the same erratum in the bfd.ld linker which we rely on for the core kernel.) Normally, this gets taken care of by the linker, which can spot such sequences at final link time and insert a veneer if the ADRP ends up at a vulnerable offset. However, linux kernel modules are partially linked ELF objects, and so there is no 'final link time' other than the runtime loading of the module, at which time all the static relocations are resolved. For this reason, we have implemented the #843419 workaround for modules by avoiding ADRP instructions altogether, by using the large C model, and by passing -mpc-relative-literal-loads to recent versions of GCC that may emit adrp/ldr pairs to perform literal loads. However, this workaround forces us to keep literal data mixed with the instructions in the executable .text segment, and literal data may inadvertently turn into an exploitable speculative gadget depending on the relative offsets of arbitrary symbols. So let's reimplement this workaround in a way that allows us to switch back to the small C model and to drop the -mpc-relative-literal-loads GCC switch, by patching affected ADRP instructions at runtime:
- ADRP instructions that do not appear at 4k relative offset 0xff8 or 0xffc are ignored
- ADRP instructions that are within 1 MB of their target symbol are converted into ADR instructions
- remaining ADRP instructions are redirected via a veneer that performs the load using an unaffected movn/movk sequence
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [will: tidied up ADRP -> ADR instruction patching.] [will: use ULL suffix for 64-bit immediate] Signed-off-by: Will Deacon <will.deacon@arm.com>
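An illustrative sketch of the two patching decisions listed above; the real implementation lives in the arm64 module loader and instruction-patching code and also emits the movn/movk veneer for the remaining cases. The helper names here are hypothetical.

  #include <linux/sizes.h>
  #include <linux/types.h>

  /* Only ADRPs in the last two instruction slots of a 4 KiB page are affected. */
  static bool adrp_needs_workaround(unsigned long pc)
  {
  	return (pc & 0xfff) == 0xff8 || (pc & 0xfff) == 0xffc;
  }

  /* ADR has a +/-1 MiB range, so nearby targets can be reached without ADRP. */
  static bool adrp_can_become_adr(unsigned long pc, unsigned long target)
  {
  	long delta = (long)(target - pc);

  	return delta >= -(long)SZ_1M && delta < (long)SZ_1M;
  }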
-
- 07 March 2018, 5 commits
-
-
Committed by Will Deacon
TCR_EL1.NFD1 was allocated by SVE and ensures that fault-suppressing SVE memory accesses (e.g. speculative accesses from a first-fault gather load) which translate via TTBR1_EL1 result in a translation fault if they miss in the TLB when executed from EL0. This mitigates some timing attacks against KASLR, where the kernel address space could otherwise be probed efficiently using the FFR in conjunction with suppressed faults on SVE loads. Cc: Dave Martin <Dave.Martin@arm.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Catalin Marinas
Commit 97303480 ("arm64: Increase the max granular size") increased the cache line size to 128 to match Cavium ThunderX, apparently for some performance benefit which could not be confirmed. This change, however, has an impact on network packet allocation in certain circumstances, requiring slightly over a 4K page and causing a significant performance degradation. This patch reverts L1_CACHE_SHIFT back to 6 (64-byte cache line) while keeping ARCH_DMA_MINALIGN at 128. The cache_line_size() function was changed to default to ARCH_DMA_MINALIGN in the absence of a meaningful CTR_EL0.CWG bit field. In addition, if a system with ARCH_DMA_MINALIGN < CTR_EL0.CWG is detected, the kernel will force swiotlb bounce buffering for all non-coherent devices, since DMA cache maintenance on sub-CWG ranges is not safe and leads to data corruption. Cc: Tirumalesh Chalamarla <tchalamarla@cavium.com> Cc: Timur Tabi <timur@codeaurora.org> Cc: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
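A simplified sketch of the cache_line_size() behaviour described above (the upstream arm64 version differs in detail): fall back to ARCH_DMA_MINALIGN when CTR_EL0.CWG does not report a meaningful value.

  int cache_line_size(void)
  {
  	/* CTR_EL0.CWG encodes log2(words) of the maximum writeback granule; 0 means "not reported". */
  	u32 cwg = cache_type_cwg();

  	return cwg ? 4 << cwg : ARCH_DMA_MINALIGN;
  }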
-
Committed by Will Deacon
show_unhandled_signals_ratelimited is only called in traps.c, so move it out of its macro in the dreaded system_misc.h and into a static function in traps.c. Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Will Deacon
In preparation for consolidating our handling of printing unhandled signals, introduce a wrapper around force_sig_info which can act as the canonical place for dealing with show_unhandled_signals. Initially, we just hook this up to arm64_notify_die. Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Committed by Will Deacon
force_signal_inject is a little flaky:
* It only knows about SIGILL and SIGSEGV, so can potentially deliver other signals based on a partially initialised siginfo_t
* It sets si_addr to point at the PC for SIGSEGV
* It always operates on current, so doesn't need the regs argument
This patch fixes these issues by always assigning the si_addr field to the address parameter of the function and updates the callers (including those that indirectly call via arm64_notify_segfault) accordingly. Signed-off-by: Will Deacon <will.deacon@arm.com>
-
- 23 February 2018, 1 commit
-
-
Committed by Pratyush Anand
do_task_stat() calls get_wchan(), which in turn does unwind_frame(). unwind_frame() restores frame->pc to its original value in case the function graph tracer has modified a return address (LR) in a stack frame to hook a function return. However, if the function graph tracer has hit a filtered function, we can't unwind it, as ftrace_push_return_trace() has biased the index (frame->graph) with a 'huge negative' offset (-FTRACE_NOTRACE_DEPTH). Moreover, the arm64 stack walker defines the index (frame->graph) as an unsigned int, which cannot be compared against a negative value. A similar problem can occur when walk_stackframe() is called from save_stack_trace_tsk() or dump_backtrace(). This patch fixes unwind_frame() to test the index for a negative value and restore the index accordingly before restoring frame->pc.

Reproducer:

  cd /sys/kernel/debug/tracing/
  echo schedule > set_graph_notrace
  echo 1 > options/display-graph
  echo wakeup > current_tracer
  ps -ef | grep -i agent

The above commands result in:

  Unable to handle kernel paging request at virtual address ffff801bd3d1e000
  pgd = ffff8003cbe97c00
  [ffff801bd3d1e000] *pgd=0000000000000000, *pud=0000000000000000
  Internal error: Oops: 96000006 [#1] SMP
  [...]
  CPU: 5 PID: 11696 Comm: ps Not tainted 4.11.0+ #33
  [...]
  task: ffff8003c21ba000 task.stack: ffff8003cc6c0000
  PC is at unwind_frame+0x12c/0x180
  LR is at get_wchan+0xd4/0x134
  pc : [<ffff00000808892c>] lr : [<ffff0000080860b8>] pstate: 60000145
  sp : ffff8003cc6c3ab0
  x29: ffff8003cc6c3ab0 x28: 0000000000000001
  x27: 0000000000000026 x26: 0000000000000026
  x25: 00000000000012d8 x24: 0000000000000000
  x23: ffff8003c1c04000 x22: ffff000008c83000
  x21: ffff8003c1c00000 x20: 000000000000000f
  x19: ffff8003c1bc0000 x18: 0000fffffc593690
  x17: 0000000000000000 x16: 0000000000000001
  x15: 0000b855670e2b60 x14: 0003e97f22cf1d0f
  x13: 0000000000000001 x12: 0000000000000000
  x11: 00000000e8f4883e x10: 0000000154f47ec8
  x9 : 0000000070f367c0 x8 : 0000000000000000
  x7 : 00008003f7290000 x6 : 0000000000000018
  x5 : 0000000000000000 x4 : ffff8003c1c03cb0
  x3 : ffff8003c1c03ca0 x2 : 00000017ffe80000
  x1 : ffff8003cc6c3af8 x0 : ffff8003d3e9e000
  Process ps (pid: 11696, stack limit = 0xffff8003cc6c0000)
  Stack: (0xffff8003cc6c3ab0 to 0xffff8003cc6c4000)
  [...]
  [<ffff00000808892c>] unwind_frame+0x12c/0x180
  [<ffff000008305008>] do_task_stat+0x864/0x870
  [<ffff000008305c44>] proc_tgid_stat+0x3c/0x48
  [<ffff0000082fde0c>] proc_single_show+0x5c/0xb8
  [<ffff0000082b27e0>] seq_read+0x160/0x414
  [<ffff000008289e6c>] __vfs_read+0x58/0x164
  [<ffff00000828b164>] vfs_read+0x88/0x144
  [<ffff00000828c2e8>] SyS_read+0x60/0xc0
  [<ffff0000080834a0>] __sys_trace_return+0x0/0x4

Fixes: 20380bb3 (arm64: ftrace: fix a stack tracer's output under function graph tracer) Signed-off-by: Pratyush Anand <panand@redhat.com> Signed-off-by: Jerome Marchand <jmarchan@redhat.com> [catalin.marinas@arm.com: replace WARN_ON with WARN_ON_ONCE] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
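A hedged sketch of the check the commit describes, written from the description above rather than copied from the tree (the authoritative fix is in arch/arm64/kernel/stacktrace.c); it assumes struct stackframe::graph has been made a signed int so the biased index is representable.

  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
  	if (tsk->ret_stack &&
  	    frame->pc == (unsigned long)return_to_handler) {
  		/* An exhausted index means there is nothing left to restore. */
  		if (WARN_ON_ONCE(frame->graph == -1))
  			return -EINVAL;
  		/* A filtered-out function leaves a hugely negative index: undo the bias. */
  		if (frame->graph < -1)
  			frame->graph += FTRACE_NOTRACE_DEPTH;

  		/* Restore the original return address hooked by the tracer. */
  		frame->pc = tsk->ret_stack[frame->graph--].ret;
  	}
  #endif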
-
- 19 February 2018, 2 commits
-
-
Committed by Robin Murphy
In converting __range_ok() into a static inline, I inadvertently made it more type-safe, but without considering the ordering of the relevant conversions. This leads to quite a lot of Sparse noise about the fact that we use __chk_user_ptr() after addr has already been converted from a user pointer to an unsigned long. Rather than just adding another cast for the sake of shutting Sparse up, it seems reasonable to rework the types to make logical sense (although the resulting codegen for __range_ok() remains identical). The only callers this affects directly are our compat traps, where the inferred "user-pointer-ness" of a register value now warrants explicit casting. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-
Committed by Bhupesh Sharma
Since commit e1a50de3 ("arm64: cputype: Silence Sparse warnings"), compilation of the arm64 architecture is broken with the following error messages:

  AR      arch/arm64/kernel/built-in.o
  arch/arm64/kernel/head.S: Assembler messages:
  arch/arm64/kernel/head.S:677: Error: found 'L', expected: ')'
  arch/arm64/kernel/head.S:677: Error: found 'L', expected: ')'
  arch/arm64/kernel/head.S:677: Error: found 'L', expected: ')'
  arch/arm64/kernel/head.S:677: Error: junk at end of line, first unrecognized character is `L'
  arch/arm64/kernel/head.S:677: Error: unexpected characters following instruction at operand 2 -- `movz x1,:abs_g1_s:0xff00ffffffUL'
  arch/arm64/kernel/head.S:677: Error: unexpected characters following instruction at operand 2 -- `movk x1,:abs_g0_nc:0xff00ffffffUL'

This patch fixes the issue by using the UL() macro correctly when assigning the MPIDR_HWID_BITMASK macro value. Fixes: e1a50de3 ("arm64: cputype: Silence Sparse warnings") Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
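A sketch of the idea behind the fix: a bare "UL" suffix on the constant is rejected by the assembler (as the error log shows), whereas the _AC()/UL() macro expands to a plain constant when included from assembly and to a UL-suffixed constant in C, so the same header works for both. The exact macro and include used upstream may differ; the constant value is taken from the error log above.

  #include <linux/const.h>	/* for _AC() */

  #ifndef UL
  #define UL(x)			_AC(x, UL)
  #endif

  #define MPIDR_HWID_BITMASK	UL(0xff00ffffff)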
-
- 17 February 2018, 2 commits
-
-
Committed by Robin Murphy
Sparse makes a fair bit of noise about our MPIDR mask being implicitly long - let's explicitly describe it as such rather than just relying on the value forcing automatic promotion. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-
Committed by Will Deacon
In many cases, page tables can be accessed concurrently by either another CPU (due to things like fast gup) or by the hardware page table walker itself, which may set access/dirty bits. In such cases, it is important to use READ_ONCE/WRITE_ONCE when accessing page table entries so that entries cannot be torn, merged or subject to apparent loss of coherence due to compiler transformations. Whilst there are some scenarios where this cannot happen (e.g. pinned kernel mappings for the linear region), the overhead of using READ_ONCE/WRITE_ONCE everywhere is minimal and makes the code an awful lot easier to reason about. This patch consistently uses these macros in the arch code, as well as explicitly namespacing pointers to page table entries from the entries themselves by adopting a 'p' suffix for the former (as is sometimes done elsewhere in the kernel source). Tested-by: Yury Norov <ynorov@caviumnetworks.com> Tested-by: Richard Ruigrok <rruigrok@codeaurora.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
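A small, hypothetical example of the pattern described above (not taken from the patch itself): live page-table entries are always accessed through READ_ONCE()/WRITE_ONCE(), and the 'p' suffix distinguishes a pointer to an entry from the entry value.

  static inline void example_set_young(pte_t *ptep)
  {
  	pte_t pte = READ_ONCE(*ptep);		/* single, untorn read of the entry */

  	if (!pte_valid(pte))
  		return;

  	WRITE_ONCE(*ptep, pte_mkyoung(pte));	/* single, untorn write back */
  }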
-