提交 · 62d18ecfa64137349fac9c5817784fbd48b54f48 · openeuler / Kernel

23 5月, 2018 1 次提交

arm64: fault: Don't leak data in ESR context for user fault on kernel VA · cc198460

由 Peter Maydell 提交于 5月 22, 2018

If userspace faults on a kernel address, handing them the raw ESR
value on the sigframe as part of the delivered signal can leak data
useful to attackers who are using information about the underlying hardware
fault type (e.g. translation vs permission) as a mechanism to defeat KASLR.

However there are also legitimate uses for the information provided
in the ESR -- notably the GCC and LLVM sanitizers use this to report
whether wild pointer accesses by the application are reads or writes
(since a wild write is a more serious bug than a wild read), so we
don't want to drop the ESR information entirely.

For faulting addresses in the kernel, sanitize the ESR. We choose
to present userspace with the illusion that there is nothing mapped
in the kernel's part of the address space at all, by reporting all
faults as level 0 translation faults taken to EL1.

These fields are safe to pass through to userspace as they depend
only on the instruction that userspace used to provoke the fault:
 EC IL (always)
 ISV CM WNR (for all data aborts)
All the other fields in ESR except DFSC are architecturally RES0
for an L0 translation fault taken to EL1, so can be zeroed out
without confusing userspace.

The illusion is not entirely perfect, as there is a tiny wrinkle
where we will report an alignment fault that was not due to the memory
type (for instance a LDREX to an unaligned address) as a translation
fault, whereas if you do this on real unmapped memory the alignment
fault takes precedence. This is not likely to trip anybody up in
practice, as the only users we know of for the ESR information who
care about the behaviour for kernel addresses only really want to
know about the WnR bit.
Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

cc198460

27 3月, 2018 1 次提交

arm64: capabilities: Update prototype for enable call back · c0cda3b8

由 Dave Martin 提交于 3月 26, 2018

We issue the enable() call back for all CPU hwcaps capabilities
available on the system, on all the CPUs. So far we have ignored
the argument passed to the call back, which had a prototype to
accept a "void *" for use with on_each_cpu() and later with
stop_machine(). However, with commit 0a0d111d
("arm64: cpufeature: Pass capability structure to ->enable callback"),
there are some users of the argument who wants the matching capability
struct pointer where there are multiple matching criteria for a single
capability. Clean up the declaration of the call back to make it clear.

 1) Renamed to cpu_enable(), to imply taking necessary actions on the
    called CPU for the entry.
 2) Pass const pointer to the capability, to allow the call back to
    check the entry. (e.,g to check if any action is needed on the CPU)
 3) We don't care about the result of the call back, turning this to
    a void.

Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: James Morse <james.morse@arm.com>
Acked-by: NRobin Murphy <robin.murphy@arm.com>
Reviewed-by: NJulien Thierry <julien.thierry@arm.com>
Signed-off-by: NDave Martin <dave.martin@arm.com>
[suzuki: convert more users, rename call back and drop results]
Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

c0cda3b8

09 3月, 2018 1 次提交

arm64: signal: Ensure si_code is valid for all fault signals · af40ff68

由 Dave Martin 提交于 3月 08, 2018

Currently, as reported by Eric, an invalid si_code value 0 is
passed in many signals delivered to userspace in response to faults
and other kernel errors.  Typically 0 is passed when the fault is
insufficiently diagnosable or when there does not appear to be any
sensible alternative value to choose.

This appears to violate POSIX, and is intuitively wrong for at
least two reasons arising from the fact that 0 == SI_USER:

 1) si_code is a union selector, and SI_USER (and si_code <= 0 in
    general) implies the existence of a different set of fields
    (siginfo._kill) from that which exists for a fault signal
    (siginfo._sigfault).  However, the code raising the signal
    typically writes only the _sigfault fields, and the _kill
    fields make no sense in this case.

    Thus when userspace sees si_code == 0 (SI_USER) it may
    legitimately inspect fields in the inactive union member _kill
    and obtain garbage as a result.

    There appears to be software in the wild relying on this,
    albeit generally only for printing diagnostic messages.

 2) Software that wants to be robust against spurious signals may
    discard signals where si_code == SI_USER (or <= 0), or may
    filter such signals based on the si_uid and si_pid fields of
    siginfo._sigkill.  In the case of fault signals, this means
    that important (and usually fatal) error conditions may be
    silently ignored.

In practice, many of the faults for which arm64 passes si_code == 0
are undiagnosable conditions such as exceptions with syndrome
values in ESR_ELx to which the architecture does not yet assign any
meaning, or conditions indicative of a bug or error in the kernel
or system and thus that are unrecoverable and should never occur in
normal operation.

The approach taken in this patch is to translate all such
undiagnosable or "impossible" synchronous fault conditions to
SIGKILL, since these are at least probably localisable to a single
process.  Some of these conditions should really result in a kernel
panic, but due to the lack of diagnostic information it is
difficult to be certain: this patch does not add any calls to
panic(), but this could change later if justified.

Although si_code will not reach userspace in the case of SIGKILL,
it is still desirable to pass a nonzero value so that the common
siginfo handling code can detect incorrect use of si_code == 0
without false positives.  In this case the si_code dependent
siginfo fields will not be correctly initialised, but since they
are not passed to userspace I deem this not to matter.

A few faults can reasonably occur in realistic userspace scenarios,
and _should_ raise a regular, handleable (but perhaps not
ignorable/blockable) signal: for these, this patch attempts to
choose a suitable standard si_code value for the raised signal in
each case instead of 0.

arm64 was the only arch to define a BUS_FIXME code, so after this
patch nobody defines it.  This patch therefore also removes the
relevant code from siginfo_layout().

Cc: James Morse <james.morse@arm.com>
Reported-by: NEric W. Biederman <ebiederm@xmission.com>
Signed-off-by: NDave Martin <Dave.Martin@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

af40ff68

07 3月, 2018 2 次提交

arm64: mm: Rework unhandled user pagefaults to call arm64_force_sig_info · 92ff0674

由 Will Deacon 提交于 2月 20, 2018

Reporting unhandled user pagefaults via arm64_force_sig_info means
that __do_user_fault can be drastically simplified, since it no longer
has to worry about printing the fault information and can consequently
just take the siginfo as a parameter.
Signed-off-by: NWill Deacon <will.deacon@arm.com>

92ff0674

arm64: Pass user fault info to arm64_notify_die instead of printing it · 1049c308

由 Will Deacon 提交于 2月 20, 2018

There's no need for callers of arm64_notify_die to print information
about user faults. Instead, they can pass a string to arm64_notify_die
which will be printed subject to show_unhandled_signals.
Signed-off-by: NWill Deacon <will.deacon@arm.com>

1049c308

17 2月, 2018 1 次提交

arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables · 20a004e7

由 Will Deacon 提交于 2月 15, 2018

In many cases, page tables can be accessed concurrently by either another
CPU (due to things like fast gup) or by the hardware page table walker
itself, which may set access/dirty bits. In such cases, it is important
to use READ_ONCE/WRITE_ONCE when accessing page table entries so that
entries cannot be torn, merged or subject to apparent loss of coherence
due to compiler transformations.

Whilst there are some scenarios where this cannot happen (e.g. pinned
kernel mappings for the linear region), the overhead of using READ_ONCE
/WRITE_ONCE everywhere is minimal and makes the code an awful lot easier
to reason about. This patch consistently uses these macros in the arch
code, as well as explicitly namespacing pointers to page table entries
from the entries themselves by using adopting a 'p' suffix for the former
(as is sometimes used elsewhere in the kernel source).
Tested-by: NYury Norov <ynorov@caviumnetworks.com>
Tested-by: NRichard Ruigrok <rruigrok@codeaurora.org>
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

20a004e7

07 2月, 2018 3 次提交

arm64: entry: Apply BP hardening for suspicious interrupts from EL0 · 30d88c0e

由 Will Deacon 提交于 2月 02, 2018

It is possible to take an IRQ from EL0 following a branch to a kernel
address in such a way that the IRQ is prioritised over the instruction
abort. Whilst an attacker would need to get the stars to align here,
it might be sufficient with enough calibration so perform BP hardening
in the rare case that we see a kernel address in the ELR when handling
an IRQ from EL0.
Reported-by: NDan Hettena <dhettena@nvidia.com>
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

30d88c0e

arm64: entry: Apply BP hardening for high-priority synchronous exceptions · 5dfc6ed2

由 Will Deacon 提交于 2月 02, 2018

Software-step and PC alignment fault exceptions have higher priority than
instruction abort exceptions, so apply the BP hardening hooks there too
if the user PC appears to reside in kernel space.
Reported-by: NDan Hettena <dhettena@nvidia.com>
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

5dfc6ed2

arm64: Make USER_DS an inclusive limit · 51369e39

由 Robin Murphy 提交于 2月 05, 2018

Currently, USER_DS represents an exclusive limit while KERNEL_DS is
inclusive. In order to do some clever trickery for speculation-safe
masking, we need them both to behave equivalently - there aren't enough
bits to make KERNEL_DS exclusive, so we have precisely one option. This
also happens to correct a longstanding false negative for a range
ending on the very top byte of kernel memory.

Mark Rutland points out that we've actually got the semantics of
addresses vs. segments muddled up in most of the places we need to
amend, so shuffle the {USER,KERNEL}_DS definitions around such that we
can correct those properly instead of just pasting "-1"s everywhere.
Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

51369e39

13 1月, 2018 1 次提交

signal/arm64: Document conflicts with SI_USER and SIGFPE,SIGTRAP,SIGBUS · 526c3ddb

由 Eric W. Biederman 提交于 1月 03, 2018

Setting si_code to 0 results in a userspace seeing an si_code of 0.
This is the same si_code as SI_USER.  Posix and common sense requires
that SI_USER not be a signal specific si_code.  As such this use of 0
for the si_code is a pretty horribly broken ABI.

Further use of si_code == 0 guaranteed that copy_siginfo_to_user saw a
value of __SI_KILL and now sees a value of SIL_KILL with the result
that uid and pid fields are copied and which might copying the si_addr
field by accident but certainly not by design.  Making this a very
flakey implementation.

Utilizing FPE_FIXME, BUS_FIXME, TRAP_FIXME siginfo_layout will now return
SIL_FAULT and the appropriate fields will be reliably copied.

But folks this is a new and unique kind of bad.  This is massively
untested code bad.  This is inventing new and unique was to get
siginfo wrong bad.  This is don't even think about Posix or what
siginfo means bad.  This is lots of eyeballs all missing the fact
that the code does the wrong thing bad.  This is getting stuck
and keep making the same mistake bad.

I really hope we can find a non userspace breaking fix for this on a
port as new as arm64.

Possible ABI fixes include:
- Send the signal without siginfo
- Don't generate a signal
- Possibly assign and use an appropriate si_code
- Don't handle cases which can't happen

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Tyler Baicar <tbaicar@codeaurora.org>
Cc: James Morse <james.morse@arm.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Nicolas Pitre <nico@linaro.org>
Cc: Olof Johansson <olof@lixom.net>
Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linux-arm-kernel@lists.infradead.org
Ref: 53631b54 ("arm64: Floating point and SIMD")
Ref: 32015c23 ("arm64: exception: handle Synchronous External Abort")
Ref: 1d18c47c ("arm64: MMU fault handling and page table management")
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>

526c3ddb

09 1月, 2018 1 次提交

arm64: Add skeleton to harden the branch predictor against aliasing attacks · 0f15adbb

由 Will Deacon 提交于 1月 03, 2018

Aliasing attacks against CPU branch predictors can allow an attacker to
redirect speculative control flow on some CPUs and potentially divulge
information from one context to another.

This patch adds initial skeleton code behind a new Kconfig option to
enable implementation-specific mitigations against these attacks for
CPUs that are affected.
Co-developed-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

0f15adbb

13 12月, 2017 1 次提交

arm64: fault: avoid send SIGBUS two times · faa75e14

由 Dongjiu Geng 提交于 12月 13, 2017

do_sea() calls arm64_notify_die() which will always signal
user-space. It also returns whether APEI claimed the external
abort as a RAS notification. If it returns failure do_mem_abort()
will signal user-space too.

do_mem_abort() wants to know if we handled the error, we always
call arm64_notify_die() so can always return success.
Signed-off-by: NDongjiu Geng <gengdongjiu@huawei.com>
Reviewed-by: NJames Morse <james.morse@arm.com>
Reviewed-by: NXie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

faa75e14

02 11月, 2017 1 次提交

arm64: Don't walk page table for user faults in do_mem_abort · 80b6eb04

由 Will Deacon 提交于 10月 31, 2017

Commit 42dbf54e ("arm64: consistently log ESR and page table")
dumps page table entries for user faults hitting do_bad entries in the
fault handler table. Whilst this shouldn't really happen in practice,
it's not beyond the realms of possibility if e.g. running an old kernel
on a new CPU.

Generally, we want to avoid exposing physical addresses under the control
of userspace (see commit bf396c09 ("arm64: mm: don't print out page
table entries on EL0 faults")), so walk the page tables only on exceptions
from EL1.
Reported-by: NKristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

80b6eb04

27 10月, 2017 1 次提交

arm64: consistently log ESR and page table · 42dbf54e

由 Mark Rutland 提交于 10月 19, 2017

When we take a fault we can't handle, we try to dump some relevant
information, but we're not consistent about doing so.

In do_mem_abort(), we log the full ESR, but don't dump a page table
walk. In __do_kernel_fault, we dump an attempted decoding of the ESR
(but not the ESR itself) along with a page table walk.

Let's try to make things more consistent by dumping the full ESR in
mem_abort_decode(), and having do_mem_abort dump a page table walk. The
existing dump of the ESR in do_mem_abort() is rendered redundant, and
removed.
Tested-by: NLaura Abbott <labbott@redhat.com>
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Julien Thierry <julien.thierry@arm.com>
Cc: Kristina Martsenko <kristina.martsenko@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

42dbf54e

19 10月, 2017 1 次提交

arm64: Update fault_info table with new exception types · 3f7c86b2

由 Julien Thierry 提交于 10月 17, 2017

Based on: ARM Architecture Reference Manual, ARMv8 (DDI 0487B.b).

ARMv8.1 introduces the optional feature ARMv8.1-TTHM which can trigger a
new type of memory abort. This exception is triggered when hardware update
of page table flags is not atomic in regards to other memory accesses.
Replace the corresponding unknown entry with a more accurate one.

Cf: Section D10.2.28 ESR_ELx, Exception Syndrome Register (p D10-2381),
section D4.4.11 Restriction on memory types for hardware updates on page
tables (p D4-2116 - D4-2117).

ARMv8.2 does not add new exception types, however it is worth mentioning
that when obligatory feature RAS (optional for ARMv8.{0,1}) is implemented,
exceptions related to "Synchronous parity or ECC error on memory access,
not on translation table walk" become reserved and should not occur.
Signed-off-by: NJulien Thierry <julien.thierry@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

3f7c86b2

02 10月, 2017 2 次提交

arm64: fix misleading data abort decoding · 0a6de8b8

由 Mark Rutland 提交于 10月 02, 2017

Currently data_abort_decode() dumps the ISS field as a decimal value
with a '0x' prefix, which is somewhat misleading.

Fix it to print as hexadecimal, as was intended.

Fixes: 1f9b8936 ("arm64: Decode information from ESR upon mem faults")
Reviewed-by: NDave Martin <Dave.Martin@arm.com>
Reviewed-by: NJulien Thierry <julien.thierry@arm.com>
Acked-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

0a6de8b8

arm64: mm: Remove useless and wrong comments from fault.c · f67d5c4f

由 Will Deacon 提交于 9月 22, 2017

Fault.c seems to be a magnet for useless and wrong comments, largely
due to its ancestry in other architectures where the code has since
moved on, but the comments have remained intact.

This patch removes both useless and incorrect comments, leaving only
those that say something correct and relevant.
Reported-by: NWenjia Zhou <zhiyuan_zhu@htc.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

f67d5c4f

29 9月, 2017 1 次提交

arm64: fault: Route pte translation faults via do_translation_fault · 760bfb47

由 Will Deacon 提交于 9月 29, 2017

We currently route pte translation faults via do_page_fault, which elides
the address check against TASK_SIZE before invoking the mm fault handling
code. However, this can cause issues with the path walking code in
conjunction with our word-at-a-time implementation because
load_unaligned_zeropad can end up faulting in kernel space if it reads
across a page boundary and runs into a page fault (e.g. by attempting to
read from a guard region).

In the case of such a fault, load_unaligned_zeropad has registered a
fixup to shift the valid data and pad with zeroes, however the abort is
reported as a level 3 translation fault and we dispatch it straight to
do_page_fault, despite it being a kernel address. This results in calling
a sleeping function from atomic context:

  BUG: sleeping function called from invalid context at arch/arm64/mm/fault.c:313
  in_atomic(): 0, irqs_disabled(): 0, pid: 10290
  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
  [...]
  [<ffffff8e016cd0cc>] ___might_sleep+0x134/0x144
  [<ffffff8e016cd158>] __might_sleep+0x7c/0x8c
  [<ffffff8e016977f0>] do_page_fault+0x140/0x330
  [<ffffff8e01681328>] do_mem_abort+0x54/0xb0
  Exception stack(0xfffffffb20247a70 to 0xfffffffb20247ba0)
  [...]
  [<ffffff8e016844fc>] el1_da+0x18/0x78
  [<ffffff8e017f399c>] path_parentat+0x44/0x88
  [<ffffff8e017f4c9c>] filename_parentat+0x5c/0xd8
  [<ffffff8e017f5044>] filename_create+0x4c/0x128
  [<ffffff8e017f59e4>] SyS_mkdirat+0x50/0xc8
  [<ffffff8e01684e30>] el0_svc_naked+0x24/0x28
  Code: 36380080 d5384100 f9400800 9402566d (d4210000)
  ---[ end trace 2d01889f2bca9b9f ]---

Fix this by dispatching all translation faults to do_translation_faults,
which avoids invoking the page fault logic for faults on kernel addresses.

Cc: <stable@vger.kernel.org>
Reported-by: NAnkit Jain <ankijain@codeaurora.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

760bfb47

23 8月, 2017 1 次提交

arm64: mm: abort uaccess retries upon fatal signal · 289d07a2

由 Mark Rutland 提交于 7月 11, 2017

When there's a fatal signal pending, arm64's do_page_fault()
implementation returns 0. The intent is that we'll return to the
faulting userspace instruction, delivering the signal on the way.

However, if we take a fatal signal during fixing up a uaccess, this
results in a return to the faulting kernel instruction, which will be
instantly retried, resulting in the same fault being taken forever. As
the task never reaches userspace, the signal is not delivered, and the
task is left unkillable. While the task is stuck in this state, it can
inhibit the forward progress of the system.

To avoid this, we must ensure that when a fatal signal is pending, we
apply any necessary fixup for a faulting kernel instruction. Thus we
will return to an error path, and it is up to that code to make forward
progress towards delivering the fatal signal.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: NSteve Capper <steve.capper@arm.com>
Tested-by: NSteve Capper <steve.capper@arm.com>
Reviewed-by: NJames Morse <james.morse@arm.com>
Tested-by: NJames Morse <james.morse@arm.com>
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

289d07a2

21 8月, 2017 3 次提交

arm64: Remove the !CONFIG_ARM64_HW_AFDBM alternative code paths · af29678f

由 Catalin Marinas 提交于 7月 06, 2017

Since the pte handling for hardware AF/DBM works even when the hardware
feature is not present, make the pte accessors implementation permanent
and remove the corresponding #ifdefs. The Kconfig option is kept as it
can still be used to disable the feature at the hardware level.
Reviewed-by: NWill Deacon <will.deacon@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

af29678f

arm64: Move PTE_RDONLY bit handling out of set_pte_at() · 73e86cb0

由 Catalin Marinas 提交于 7月 04, 2017

Currently PTE_RDONLY is treated as a hardware only bit and not handled
by the pte_mkwrite(), pte_wrprotect() or the user PAGE_* definitions.
The set_pte_at() function is responsible for setting this bit based on
the write permission or dirty state. This patch moves the PTE_RDONLY
handling out of set_pte_at into the pte_mkwrite()/pte_wrprotect()
functions. The PAGE_* definitions to need to be updated to explicitly
include PTE_RDONLY when !PTE_WRITE.

The patch also removes the redundant PAGE_COPY(_EXEC) definitions as
they are identical to the corresponding PAGE_READONLY(_EXEC).
Reviewed-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

73e86cb0

arm64: Convert pte handling from inline asm to using (cmp)xchg · 3bbf7157

由 Catalin Marinas 提交于 6月 26, 2017

With the support for hardware updates of the access and dirty states,
the following pte handling functions had to be implemented using
exclusives: __ptep_test_and_clear_young(), ptep_get_and_clear(),
ptep_set_wrprotect() and ptep_set_access_flags(). To take advantage of
the LSE atomic instructions and also make the code cleaner, convert
these pte functions to use the more generic cmpxchg()/xchg().
Reviewed-by: NWill Deacon <will.deacon@arm.com>
Acked-by: NMark Rutland <mark.rutland@arm.com>
Acked-by: NSteve Capper <steve.capper@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

3bbf7157

07 8月, 2017 1 次提交

arm64: Decode information from ESR upon mem faults · 1f9b8936

由 Julien Thierry 提交于 8月 04, 2017

When receiving unhandled faults from the CPU, description is very sparse.
Adding information about faults decoded from ESR.

Added defines to esr.h corresponding ESR fields. Values are based on ARM
Archtecture Reference Manual (DDI 0487B.a), section D7.2.28 ESR_ELx, Exception
Syndrome Register (ELx) (pages D7-2275 to D7-2280).

New output is of the form:
[   77.818059] Mem abort info:
[   77.820826]   Exception class = DABT (current EL), IL = 32 bits
[   77.826706]   SET = 0, FnV = 0
[   77.829742]   EA = 0, S1PTW = 0
[   77.832849] Data abort info:
[   77.835713]   ISV = 0, ISS = 0x00000070
[   77.839522]   CM = 0, WnR = 1
Signed-off-by: NJulien Thierry <julien.thierry@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
[catalin.marinas@arm.com: fix "%lu" in a pr_alert() call]
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

1f9b8936

04 8月, 2017 1 次提交

arm64: Fix potential race with hardware DBM in ptep_set_access_flags() · 6d332747

由 Catalin Marinas 提交于 7月 25, 2017

In a system with DBM (dirty bit management) capable agents there is a
possible race between a CPU executing ptep_set_access_flags() (maybe
non-DBM capable) and a hardware update of the dirty state (clearing of
PTE_RDONLY). The scenario:

a) the pte is writable (PTE_WRITE set), clean (PTE_RDONLY set) and old
   (PTE_AF clear)
b) ptep_set_access_flags() is called as a result of a read access and it
   needs to set the pte to writable, clean and young (PTE_AF set)
c) a DBM-capable agent, as a result of a different write access, is
   marking the entry as young (setting PTE_AF) and dirty (clearing
   PTE_RDONLY)

The current ptep_set_access_flags() implementation would set the
PTE_RDONLY bit in the resulting value overriding the DBM update and
losing the dirty state.

This patch fixes such race by setting PTE_RDONLY to the most permissive
(lowest value) of the current entry and the new one.

Fixes: 66dbd6e6 ("arm64: Implement ptep_set_access_flags() for hardware AF/DBM")
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: NMark Rutland <mark.rutland@arm.com>
Acked-by: NSteve Capper <steve.capper@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

6d332747

23 6月, 2017 3 次提交

arm/arm64: KVM: add guest SEA support · 621f48e4

由 Tyler Baicar 提交于 6月 21, 2017

Currently external aborts are unsupported by the guest abort
handling. Add handling for SEAs so that the host kernel reports
SEAs which occur in the guest kernel.

When an SEA occurs in the guest kernel, the guest exits and is
routed to kvm_handle_guest_abort(). Prior to this patch, a print
message of an unsupported FSC would be printed and nothing else
would happen. With this patch, the code gets routed to the APEI
handling of SEAs in the host kernel to report the SEA information.
Signed-off-by: NTyler Baicar <tbaicar@codeaurora.org>
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Acked-by: NChristoffer Dall <cdall@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

621f48e4

acpi: apei: handle SEA notification type for ARMv8 · 7edda088

由 Tyler Baicar 提交于 6月 21, 2017

ARM APEI extension proposal added SEA (Synchronous External Abort)
notification type for ARMv8.
Add a new GHES error source handling function for SEA. If an error
source's notification type is SEA, then this function can be registered
into the SEA exception handler. That way GHES will parse and report
SEA exceptions when they occur.
An SEA can interrupt code that had interrupts masked and is treated as
an NMI. To aid this the page of address space for mapping APEI buffers
while in_nmi() is always reserved, and ghes_ioremap_pfn_nmi() is
changed to use the helper methods to find the prot_t to map with in
the same way as ghes_ioremap_pfn_irq().
Signed-off-by: NTyler Baicar <tbaicar@codeaurora.org>
CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Reviewed-by: NJames Morse <james.morse@arm.com>
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

7edda088

arm64: exception: handle Synchronous External Abort · 32015c23

由 Tyler Baicar 提交于 6月 21, 2017

SEA exceptions are often caused by an uncorrected hardware
error, and are handled when data abort and instruction abort
exception classes have specific values for their Fault Status
Code.
When SEA occurs, before killing the process, report the error
in the kernel logs.
Update fault_info[] with specific SEA faults so that the
new SEA handler is used.
Signed-off-by: NTyler Baicar <tbaicar@codeaurora.org>
CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Reviewed-by: NJames Morse <james.morse@arm.com>
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
[will: use NULL instead of 0 when assigning si_addr]
Signed-off-by: NWill Deacon <will.deacon@arm.com>

32015c23

12 6月, 2017 6 次提交

arm64: mm: Update perf accounting to handle poison faults · 0e3a9026

由 Punit Agrawal 提交于 6月 08, 2017

Re-organise the perf accounting for fault handling in preparation for
enabling handling of hardware poison faults in subsequent commits. The
change updates perf accounting to be inline with the behaviour on
x86.

With this update, the perf fault accounting -

  * Always report PERF_COUNT_SW_PAGE_FAULTS

  * Doesn't report anything else for VM_FAULT_ERROR (which includes
    hwpoison faults)

  * Reports PERF_COUNT_SW_PAGE_FAULTS_MAJ if it's a major
    fault (indicated by VM_FAULT_MAJOR)

  * Otherwise, reports PERF_COUNT_SW_PAGE_FAULTS_MIN
Signed-off-by: NPunit Agrawal <punit.agrawal@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

0e3a9026

arm64: hwpoison: add VM_FAULT_HWPOISON[_LARGE] handling · e7c600f1

由 Jonathan (Zhixiong) Zhang 提交于 6月 08, 2017

Add VM_FAULT_HWPOISON[_LARGE] handling to the arm64 page fault
handler. Handling of VM_FAULT_HWPOISON[_LARGE] is very similar
to VM_FAULT_OOM, the only difference is that a different si_code
(BUS_MCEERR_AR) is passed to user space and si_addr_lsb field is
initialized.
Signed-off-by: NJonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Signed-off-by: NTyler Baicar <tbaicar@codeaurora.org>
(fix new __do_user_fault call-site)
Signed-off-by: NPunit Agrawal <punit.agrawal@arm.com>
Acked-by: NSteve Capper <steve.capper@arm.com>
Tested-by: NManoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

e7c600f1

arm64: fault: Print info about page table structure when dumping pte · 1eb34b6e

由 Will Deacon 提交于 5月 15, 2017

Whilst debugging a remote crash, I noticed that show_pte is unhelpful
when it comes to describing the structure of the page table being walked.
This is easily fixed by printing out the page table (swapper vs user),
page size and virtual address size when displaying the PGD address.
Acked-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

1eb34b6e

arm64: mm: print file name of faulting vma · 83016b20

由 Kristina Martsenko 提交于 6月 09, 2017

Print out the name of the file associated with the vma that faulted.
This is usually the executable or shared library name. We already print
out the task name, but also printing the library name is useful for
pinpointing bugs to libraries.

Also print the base address and size of the vma, which together with the
PC (printed by __show_regs) gives the offset into the library.

Fault prints now look like:
test[2361]: unhandled level 2 translation fault (11) at 0x00000012, esr 0x92000006, in libfoo.so[ffffa0145000+1000]

This is already done on x86, for more details see commit 03252919
("x86: print which shared library/executable faulted in segfault etc.
messages v3").
Acked-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NKristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

83016b20

arm64: mm: don't print out page table entries on EL0 faults · bf396c09

由 Kristina Martsenko 提交于 6月 09, 2017

When we take a fault from EL0 that can't be handled, we print out the
page table entries associated with the faulting address. This allows
userspace to print out any current page table entries, including kernel
(TTBR1) entries. Exposing kernel mappings like this could pose a
security risk, so don't print out page table information on EL0 faults.
(But still print it out for EL1 faults.) This also follows the same
behaviour as x86, printing out page table entries on kernel mode faults
but not user mode faults.
Acked-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NKristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

bf396c09

arm64: mm: print out correct page table entries · 67ce16ec

由 Kristina Martsenko 提交于 6月 09, 2017

When we take a fault that can't be handled, we print out the page table
entries associated with the faulting address. In some cases we currently
print out the wrong entries. For a faulting TTBR1 address, we sometimes
print out TTBR0 table entries instead, and for a faulting TTBR0 address
we sometimes print out TTBR1 table entries. Fix this by choosing the
tables based on the faulting address.
Acked-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NKristina Martsenko <kristina.martsenko@arm.com>
[will: zero-extend addrs to 64-bit, don't walk swapper w/ TTBR0 addr]
Signed-off-by: NWill Deacon <will.deacon@arm.com>

67ce16ec

30 5月, 2017 1 次提交

arm64: Call __show_regs directly · c07ab957

由 Kefeng Wang 提交于 5月 09, 2017

Generic code expects show_regs() to also dump the stack, but arm64's
show_reg() does not do this. Some arm64 callers of show_regs() *only*
want the registers dumped, without the stack.

To enable generic code to work as expected, we need to make
show_regs() dump the stack. Where we only want the registers dumped,
we must use __show_regs().

This patch updates code to use __show_regs() where only registers are
desired. A subsequent patch will modify show_regs().
Acked-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

c07ab957

07 4月, 2017 1 次提交

arm64: print a fault message when attempting to write RO memory · b824b930

由 Stephen Boyd 提交于 4月 05, 2017

If a page is marked read only we should print out that fact,
instead of printing out that there was a page fault. Right now we
get a cryptic error message that something went wrong with an
unhandled fault, but we don't evaluate the esr to figure out that
it was a read/write permission fault.

Instead of seeing:

  Unable to handle kernel paging request at virtual address ffff000008e460d8
  pgd = ffff800003504000
  [ffff000008e460d8] *pgd=0000000083473003, *pud=0000000083503003, *pmd=0000000000000000
  Internal error: Oops: 9600004f [#1] PREEMPT SMP

we'll see:

  Unable to handle kernel write to read-only memory at virtual address ffff000008e760d8
  pgd = ffff80003d3de000
  [ffff000008e760d8] *pgd=0000000083472003, *pud=0000000083435003, *pmd=0000000000000000
  Internal error: Oops: 9600004f [#1] PREEMPT SMP

We also add a userspace address check into is_permission_fault()
so that the function doesn't return true for ttbr0 PAN faults
when it shouldn't.
Reviewed-by: NJames Morse <james.morse@arm.com>
Tested-by: NJames Morse <james.morse@arm.com>
Acked-by: NLaura Abbott <labbott@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: NStephen Boyd <stephen.boyd@linaro.org>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

b824b930

04 4月, 2017 1 次提交

arm64: mm: unaligned access by user-land should be received as SIGBUS · 09a6adf5

由 Victor Kamensky 提交于 4月 03, 2017

After 52d7523d (arm64: mm: allow the kernel to handle alignment faults on
user accesses) commit user-land accesses that produce unaligned exceptions
like in case of aarch32 ldm/stm/ldrd/strd instructions operating on
unaligned memory received by user-land as SIGSEGV. It is wrong, it should
be reported as SIGBUS as it was before 52d7523d commit.

Changed do_bad_area function to take signal and code parameters out of esr
value using fault_info table, so in case of do_alignment_fault fault
user-land will receive SIGBUS. Wrapped access to fault_info table into
esr_to_fault_info function.

Cc: <stable@vger.kernel.org>
Fixes: 52d7523d (arm64: mm: allow the kernel to handle alignment faults on user accesses)
Signed-off-by: NVictor Kamensky <kamensky@cisco.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

09a6adf5

02 3月, 2017 2 次提交

sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> · b17b0153

由 Ingo Molnar 提交于 2月 08, 2017

We are going to split <linux/sched/debug.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/debug.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NIngo Molnar <mingo@kernel.org>

b17b0153

sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h> · 3f07c014

由 Ingo Molnar 提交于 2月 08, 2017

We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/signal.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NIngo Molnar <mingo@kernel.org>

3f07c014

10 1月, 2017 1 次提交

arm64: Remove useless UAO IPI and describe how this gets enabled · c8b06e3f

由 James Morse 提交于 1月 09, 2017

Since its introduction, the UAO enable call was broken, and useless.
commit 2a6dcb2b ("arm64: cpufeature: Schedule enable() calls instead
of calling them via IPI"), fixed the framework so that these calls
are scheduled, so that they can modify PSTATE.

Now it is just useless. Remove it. UAO is enabled by the code patching
which causes get_user() and friends to use the 'ldtr' family of
instructions. This relies on the PSTATE.UAO bit being set to match
addr_limit, which we do in uao_thread_switch() called via __switch_to().

All that is needed to enable UAO is patch the code, and call schedule().
__apply_alternatives_multi_stop() calls stop_machine() when it modifies
the kernel text to enable the alternatives, (including the UAO code in
uao_thread_switch()). Once stop_machine() has finished __switch_to() is
called to reschedule the original task, this causes PSTATE.UAO to be set
appropriately. An explicit enable() call is not needed.
Reported-by: NVladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: NJames Morse <james.morse@arm.com>

c8b06e3f

05 1月, 2017 1 次提交

arm64: mm: fix show_pte KERN_CONT fallout · 6ef4fb38

由 Mark Rutland 提交于 1月 03, 2017

Recent changes made KERN_CONT mandatory for continued lines. In the
absence of KERN_CONT, a newline may be implicit inserted by the core
printk code.

In show_pte, we (erroneously) use printk without KERN_CONT for continued
prints, resulting in output being split across a number of lines, and
not matching the intended output, e.g.

[ff000000000000] *pgd=00000009f511b003
, *pud=00000009f4a80003
, *pmd=0000000000000000

Fix this by using pr_cont() for all the continuations.
Acked-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

6ef4fb38

openeuler / Kernel 11 个月 前同步成功

openeuler / Kernel
11 个月前同步成功