1. 01 December 2021, 2 commits
  2. 24 November 2021, 1 commit
    • M
      arm64: uaccess: avoid blocking within critical sections · 94902d84
      Committed by Mark Rutland
      As Vincent reports in:
      
        https://lore.kernel.org/r/20211118163417.21617-1-vincent.whitchurch@axis.com
      
      The put_user() in schedule_tail() can get stuck in a livelock, similar
      to a problem recently fixed on riscv in commit:
      
        285a76bb ("riscv: evaluate put_user() arg before enabling user access")
      
      In __raw_put_user() we have a critical section between
      uaccess_ttbr0_enable() and uaccess_ttbr0_disable() where we cannot
      safely call into the scheduler without having taken an exception, as
      schedule() and other scheduling functions will not save/restore the
      TTBR0 state. If either of the `x` or `ptr` arguments to __raw_put_user()
      contain a blocking call, we may call into the scheduler within the
      critical section. This can result in two problems:
      
      1) The access within the critical section will occur without the
         required TTBR0 tables installed. This will fault, and where the
         required tables permit access, the access will be retried without the
         required tables, resulting in a livelock.
      
      2) When TTBR0 SW PAN is in use, check_and_switch_context() does not
         modify TTBR0, leaving a stale value installed. The mappings of the
         blocked task will erroneously be accessible to regular accesses in
         the context of the new task. Additionally, if the tables are
         subsequently freed, local TLB maintenance required to reuse the ASID
         may be lost, potentially resulting in TLB corruption (e.g. in the
         presence of CnP).
      
      The same issue exists for __raw_get_user() in the critical section
      between uaccess_ttbr0_enable() and uaccess_ttbr0_disable().
      
      A similar issue exists for __get_kernel_nofault() and
      __put_kernel_nofault() for the critical section between
      __uaccess_enable_tco_async() and __uaccess_disable_tco_async(), as the
      TCO state is not context-switched by direct calls into the scheduler.
      Here the TCO state may be lost from the context of the current task,
      resulting in unexpected asynchronous tag check faults. It may also be
      leaked to another task, suppressing expected tag check faults.
      
      To fix all of these cases, we must ensure that we do not directly call
      into the scheduler in their respective critical sections. This patch
      reworks __raw_put_user(), __raw_get_user(), __get_kernel_nofault(), and
      __put_kernel_nofault(), ensuring that parameters are evaluated outside
      of the critical sections. To make this requirement clear, comments are
      added describing the problem, and line spaces added to separate the
      critical sections from other portions of the macros.
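      
      As a rough illustration of the pattern (a simplified sketch, not the
      kernel's exact macro; __store_user_nofault() is a hypothetical helper
      standing in for the real access primitive):
      
      | #define put_user_sketch(x, ptr)						\
      | ({										\
      | 	__typeof__(*(ptr)) __pu_val = (x);	/* may block: evaluated here */	\
      | 	__typeof__(ptr) __pu_ptr = (ptr);	/* may block: evaluated here */	\
      | 	int __pu_err = 0;							\
      | 										\
      | 	uaccess_ttbr0_enable();		/* critical section: no blocking */	\
      | 	__store_user_nofault(__pu_ptr, __pu_val, &__pu_err);			\
      | 	uaccess_ttbr0_disable();						\
      | 										\
      | 	__pu_err;								\
      | })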
      
      For __raw_get_user() and __raw_put_user() the `err` parameter is
      conditionally assigned to, and we must currently evaluate this in the
      critical section. This behaviour is relied upon by the signal code,
      which uses chains of put_user_error() and get_user_error(), checking the
      return value at the end. In all cases, the `err` parameter is a plain
      int rather than a more complex expression with a blocking call, so this
      is safe.
      
      In future we should try to clean up the `err` usage to remove the
      potential for this to be a problem.
      
      Aside from the changes to time of evaluation, there should be no
      functional change as a result of this patch.
      Reported-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Link: https://lore.kernel.org/r/20211118163417.21617-1-vincent.whitchurch@axis.com
      Fixes: f253d827 ("arm64: uaccess: refactor __{get,put}_user")
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Link: https://lore.kernel.org/r/20211122125820.55286-1-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      94902d84
  3. 16 November 2021, 2 commits
    • P
      arm64: mm: Fix VM_BUG_ON(mm != &init_mm) for trans_pgd · d3eb70ea
      Committed by Pingfan Liu
      trans_pgd_create_copy() can hit "VM_BUG_ON(mm != &init_mm)" in the
      function pmd_populate_kernel().
      
      This is the combined consequence of commit 5de59884 ("arm64:
      trans_pgd: pass NULL instead of init_mm to *_populate functions"), which
      replaced &init_mm with NULL and commit 59511cfd ("arm64: mm: use XN
      table mapping attributes for user/kernel mappings"), which introduced
      the VM_BUG_ON.
      
      Since the former sounds reasonable, it is better to work on the latter.
      From the perspective of trans_pgd, two groups of functions in the latter
      commit need to be considered:
      
        pmd_populate_kernel()
          mm == NULL should be fixed, else it hits VM_BUG_ON()
        p?d_populate()
          mm == NULL means PXN, that is OK, since trans_pgd only copies a
          linear map, no execution will happen on the map.
      
      So it is good enough to just relax VM_BUG_ON() to disregard the mm == NULL case.
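      
      A minimal sketch of the relaxation described above (illustrative only;
      the real check lives in the arm64 pmd_populate_kernel() helper):
      
      | static inline void
      | pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp, pte_t *ptep)
      | {
      | 	/* Tolerate mm == NULL (trans_pgd), but still catch non-init_mm users. */
      | 	VM_BUG_ON(mm && mm != &init_mm);
      | 	__pmd_populate(pmdp, __pa(ptep), PMD_TYPE_TABLE | PMD_TABLE_UXN);
      | }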
      
      Fixes: 59511cfd ("arm64: mm: use XN table mapping attributes for user/kernel mappings")
      Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
      Cc: <stable@vger.kernel.org> # 5.13.x
      Cc: Ard Biesheuvel <ardb@kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Matthias Brugger <mbrugger@suse.com>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
      Link: https://lore.kernel.org/r/20211112052214.9086-1-kernelfans@gmail.com
      Signed-off-by: Will Deacon <will@kernel.org>
      d3eb70ea
    • M
      arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR · c6d3cd32
      Committed by Mark Rutland
      When CONFIG_FUNCTION_GRAPH_TRACER is selected and the function graph
      tracer is in use, unwind_frame() may erroneously associate a traced
      function with an incorrect return address. This can happen when starting
      an unwind from a pt_regs, or when unwinding across an exception
      boundary.
      
      This can be seen when recording with perf while the function graph
      tracer is in use. For example:
      
      | # echo function_graph > /sys/kernel/debug/tracing/current_tracer
      | # perf record -g -e raw_syscalls:sys_enter:k /bin/true
      | # perf report
      
      ... reports the callchain erroneously as:
      
      | el0t_64_sync
      | el0t_64_sync_handler
      | el0_svc_common.constprop.0
      | perf_callchain
      | get_perf_callchain
      | syscall_trace_enter
      | syscall_trace_enter
      
      ... whereas when the function graph tracer is not in use, it reports:
      
      | el0t_64_sync
      | el0t_64_sync_handler
      | el0_svc
      | do_el0_svc
      | el0_svc_common.constprop.0
      | syscall_trace_enter
      | syscall_trace_enter
      
      The underlying problem is that ftrace_graph_get_ret_stack() takes an
      index offset from the most recent entry added to the fgraph return
      stack. We start an unwind at offset 0, and increment the offset each
      time we encounter a rewritten return address (i.e. when we see
      `return_to_handler`). This is broken in two cases:
      
      1) Between creating a pt_regs and starting the unwind, function calls
         may place entries on the stack, leaving an arbitrary offset which we
         can only determine by performing a full unwind from the caller of the
         unwind code (and relying on none of the unwind code being
         instrumented).
      
         This can result in erroneous entries being reported in a backtrace
         recorded by perf or kfence when the function graph tracer is in use.
         Currently show_regs() is unaffected as dump_backtrace() performs an
         initial unwind.
      
      2) When unwinding across an exception boundary (whether continuing an
         unwind or starting a new unwind from regs), we currently always skip
         the LR of the interrupted context. Where this was live and contained
         a rewritten address, we won't consume the corresponding fgraph ret
         stack entry, leaving subsequent entries off-by-one.
      
         This can result in erroneous entries being reported in a backtrace
         performed by any in-kernel unwinder when that backtrace crosses an
         exception boundary, with entries after the boundary being reported
         incorrectly. This includes perf, kfence, show_regs(), panic(), etc.
      
      To fix this, we need to be able to uniquely identify each rewritten
      return address such that we can map this back to the original return
      address. We can use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR to associate
      each rewritten return address with a unique location on the stack. As
      the return address is passed in the LR (and so is not guaranteed a
      unique location in memory), we use the FP upon entry to the function
      (i.e. the address of the caller's frame record) as the return address
      pointer. Any nested call will have a different FP value as the caller
      must create its own frame record and update FP to point to this.
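      
      Conceptually, the unwinder can then resolve rewritten addresses with the
      existing helper, passing the frame record address as the unique key (a
      simplified sketch of the idea, not the exact unwind_frame() diff):
      
      | #ifdef CONFIG_FUNCTION_GRAPH_TRACER
      | 	if (tsk->ret_stack && frame->pc == (unsigned long)return_to_handler) {
      | 		/* Use the FP (address of the frame record) as the 'retp' key. */
      | 		frame->pc = ftrace_graph_ret_addr(tsk, NULL, frame->pc,
      | 						  (unsigned long *)frame->fp);
      | 	}
      | #endif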
      
      Since ftrace_graph_ret_addr() requires the return address with the PAC
      stripped, the stripping of the PAC is moved before the fixup of the
      rewritten address. As we would unconditionally strip the PAC, moving
      this earlier is not harmful, and we can avoid a redundant strip in the
      return address fixup code.
      
      I've tested this with the perf case above, the ftrace selftests, and
      a number of ad-hoc unwinder tests. The tests all pass, and I have seen
      no unexpected behaviour as a result of this change. I've tested with
      pointer authentication under QEMU TCG where magic-sysrq+l correctly
      recovers the original return addresses.
      
      Note that this doesn't fix the issue of skipping a live LR at an
      exception boundary, which is a more general problem and requires more
      substantial rework. Were we to consume the LR in all cases this would
      result in warnings where the interrupted context's LR contains
      `return_to_handler`, but the FP has been altered, e.g.
      
      | func:
      |	<--- ftrace entry ---> 	// logs FP & LR, rewrites LR
      | 	STP	FP, LR, [SP, #-16]!
      | 	MOV	FP, SP
      | 	<--- INTERRUPT --->
      
      ... as ftrace_graph_get_ret_stack() will not find a matching entry,
      triggering the WARN_ON_ONCE() in unwind_frame().
      
      Link: https://lore.kernel.org/r/20211025164925.GB2001@C02TD0UTHF1T.local
      Link: https://lore.kernel.org/r/20211027132529.30027-1-mark.rutland@arm.com
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
      Cc: Mark Brown <broonie@kernel.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Will Deacon <will@kernel.org>
      Reviewed-by: Mark Brown <broonie@kernel.org>
      Link: https://lore.kernel.org/r/20211029162245.39761-1-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      c6d3cd32
  4. 08 November 2021, 3 commits
    • Y
      KVM: arm64: Change the return type of kvm_vcpu_preferred_target() · 08e873cb
      Committed by YueHaibing
      kvm_vcpu_preferred_target() always returns 0 because kvm_target_cpu()
      never returns a negative error code.
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20211105011500.16280-1-yuehaibing@huawei.com
      08e873cb
    • M
      KVM: arm64: Extract ESR_ELx.EC only · 8bb08411
      Committed by Mark Rutland
      Since ARMv8.0 the upper 32 bits of ESR_ELx have been RES0, and recently
      some of the upper bits gained a meaning and can be non-zero. For
      example, when FEAT_LS64 is implemented, ESR_ELx[36:32] contain ISS2,
      which for an ST64BV or ST64BV0 can be non-zero. This can be seen in ARM
      DDI 0487G.b, page D13-3145, section D13.2.37.
      
      Generally, we must not rely on RES0 bits remaining zero in future, and
      when extracting ESR_ELx.EC we must mask out all other bits.
      
      All C code uses the ESR_ELx_EC() macro, which masks out the irrelevant
      bits, and therefore no alterations are required to C code to avoid
      consuming irrelevant bits.
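      
      For reference, the C-side extraction is a simple mask-and-shift; a sketch
      in the style of the <asm/esr.h> definitions:
      
      | #define ESR_ELx_EC_SHIFT	(26)
      | #define ESR_ELx_EC_WIDTH	(6)
      | #define ESR_ELx_EC_MASK		(UL(0x3F) << ESR_ELx_EC_SHIFT)
      | #define ESR_ELx_EC(esr)		(((esr) & ESR_ELx_EC_MASK) >> ESR_ELx_EC_SHIFT)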
      
      In a couple of places the KVM assembly extracts ESR_ELx.EC using LSR on
      an X register, and so could in theory consume previously RES0 bits. In
      both cases this is for comparison with EC values ESR_ELx_EC_HVC32 and
      ESR_ELx_EC_HVC64, for which the upper bits of ESR_ELx must currently be
      zero, but this could change in future.
      
      This patch adjusts the KVM vectors to use UBFX rather than LSR to
      extract ESR_ELx.EC, ensuring these are robust to future additions to
      ESR_ELx.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Alexandru Elisei <alexandru.elisei@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Acked-by: Will Deacon <will@kernel.org>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20211103110545.4613-1-mark.rutland@arm.com
      8bb08411
    • A
      arm64: pgtable: make __pte_to_phys/__phys_to_pte_val inline functions · c7c386fb
      Committed by Arnd Bergmann
      gcc warns about undefined behavior in the vmalloc code when building
      with CONFIG_ARM64_PA_BITS_52, when the 'idx++' in the argument to
      __phys_to_pte_val() is evaluated twice:
      
      mm/vmalloc.c: In function 'vmap_pfn_apply':
      mm/vmalloc.c:2800:58: error: operation on 'data->idx' may be undefined [-Werror=sequence-point]
       2800 |         *pte = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot));
            |                                                 ~~~~~~~~~^~
      arch/arm64/include/asm/pgtable-types.h:25:37: note: in definition of macro '__pte'
         25 | #define __pte(x)        ((pte_t) { (x) } )
            |                                     ^
      arch/arm64/include/asm/pgtable.h:80:15: note: in expansion of macro '__phys_to_pte_val'
         80 |         __pte(__phys_to_pte_val((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot))
            |               ^~~~~~~~~~~~~~~~~
      mm/vmalloc.c:2800:30: note: in expansion of macro 'pfn_pte'
       2800 |         *pte = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot));
            |                              ^~~~~~~
      
      I have no idea why this never showed up earlier, but the safest
      workaround appears to be changing those macros into inline functions
      so the arguments get evaluated only once.
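      
      The hazard and the fix are generic; a contrived sketch (not the kernel's
      actual definitions) of why the inline function is safe:
      
      | /* Macro version: 'phys' is expanded twice, so side effects run twice. */
      | #define bad_to_pte_val(phys)	(((phys) | ((phys) >> 36)) & PTE_ADDR_MASK)
      |
      | /* Inline function: the argument expression is evaluated exactly once. */
      | static inline pteval_t good_to_pte_val(phys_addr_t phys)
      | {
      | 	return (phys | (phys >> 36)) & PTE_ADDR_MASK;
      | }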
      
      Cc: Matthew Wilcox <willy@infradead.org>
      Fixes: 75387b92 ("arm64: handle 52-bit physical addresses in page table entries")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Link: https://lore.kernel.org/r/20211105075414.2553155-1-arnd@kernel.org
      Signed-off-by: Will Deacon <will@kernel.org>
      c7c386fb
  5. 26 October 2021, 1 commit
  6. 23 October 2021, 1 commit
  7. 22 October 2021, 2 commits
  8. 21 October 2021, 17 commits
    • M
      arm64: extable: add load_unaligned_zeropad() handler · 753b3236
      Committed by Mark Rutland
      For inline assembly, we place exception fixups out-of-line in the
      `.fixup` section such that these are out of the way of the fast path.
      This has a few drawbacks:
      
      * Since the fixup code is anonymous, backtraces will symbolize fixups as
        offsets from the nearest prior symbol, currently
        `__entry_tramp_text_end`. This is confusing, and painful to debug
        without access to the relevant vmlinux.
      
      * Since the exception handler adjusts the PC to execute the fixup, and
        the fixup uses a direct branch back into the function it fixes,
        backtraces of fixups miss the original function. This is confusing,
        and violates requirements for RELIABLE_STACKTRACE (and therefore
        LIVEPATCH).
      
      * Inline assembly and associated fixups are generated from templates,
        and we have many copies of logically identical fixups which only
        differ in which specific registers are written to and which address is
        branched to at the end of the fixup. This is potentially wasteful of
        I-cache resources, and makes it hard to add additional logic to fixups
        without significant bloat.
      
      * In the case of load_unaligned_zeropad(), the logic in the fixup
        requires a temporary register that we must allocate even in the
        fast-path where it will not be used.
      
      This patch addresses all four concerns for load_unaligned_zeropad() fixups
      by adding a dedicated exception handler which performs the fixup logic
      in exception context and subsequently returns to just after the faulting
      instruction. For the moment, the fixup logic is identical to the old
      assembly fixup logic, but in future we could enhance this by taking the
      ESR and FAR into account to constrain the faults we try to fix up, or to
      specialize fixups for MTE tag check faults.
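      
      A heavily simplified sketch of what such a handler does (the register-field
      helpers ex_addr_reg()/ex_data_reg() are illustrative names, not the exact
      implementation):
      
      | static bool
      | ex_handler_load_unaligned_zeropad(const struct exception_table_entry *ex,
      | 				  struct pt_regs *regs)
      | {
      | 	unsigned long addr = pt_regs_read_reg(regs, ex_addr_reg(ex));
      | 	unsigned long offset = addr & 0x7UL;
      | 	unsigned long data;
      |
      | 	/* Reload from the aligned-down address, which is known to be mapped. */
      | 	data = *(unsigned long *)(addr & ~0x7UL);
      |
      | 	/* Shift so bytes past the fault read back as zero (little-endian case). */
      | 	data >>= 8 * offset;
      | 	pt_regs_write_reg(regs, ex_data_reg(ex), data);
      |
      | 	regs->pc = get_ex_fixup(ex);	/* continue after the faulting load */
      | 	return true;
      | }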
      
      Other than backtracing, there should be no functional change as a result
      of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20211019160219.5202-13-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      753b3236
    • M
      arm64: extable: add a dedicated uaccess handler · 2e77a62c
      Committed by Mark Rutland
      For inline assembly, we place exception fixups out-of-line in the
      `.fixup` section such that these are out of the way of the fast path.
      This has a few drawbacks:
      
      * Since the fixup code is anonymous, backtraces will symbolize fixups as
        offsets from the nearest prior symbol, currently
        `__entry_tramp_text_end`. This is confusing, and painful to debug
        without access to the relevant vmlinux.
      
      * Since the exception handler adjusts the PC to execute the fixup, and
        the fixup uses a direct branch back into the function it fixes,
        backtraces of fixups miss the original function. This is confusing,
        and violates requirements for RELIABLE_STACKTRACE (and therefore
        LIVEPATCH).
      
      * Inline assembly and associated fixups are generated from templates,
        and we have many copies of logically identical fixups which only
        differ in which specific registers are written to and which address is
        branched to at the end of the fixup. This is potentially wasteful of
        I-cache resources, and makes it hard to add additional logic to fixups
        without significant bloat.
      
      This patch addresses all three concerns for inline uaccess fixups by
      adding a dedicated exception handler which updates registers in
      exception context and subsequently returns back into the function which
      faulted, removing the need for fixups specialized to each faulting
      instruction.
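      
      A simplified sketch of the shape of such a handler (the field-extraction
      helpers ex_err_reg()/ex_zero_reg() are illustrative names): on a fault it
      writes -EFAULT into the recorded error register, zeroes the recorded
      destination register, and resumes at the fixup address.
      
      | static bool ex_handler_uaccess_err_zero(const struct exception_table_entry *ex,
      | 					struct pt_regs *regs)
      | {
      | 	pt_regs_write_reg(regs, ex_err_reg(ex), -EFAULT);
      | 	pt_regs_write_reg(regs, ex_zero_reg(ex), 0);
      | 	regs->pc = get_ex_fixup(ex);
      | 	return true;
      | }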
      
      Other than backtracing, there should be no functional change as a result
      of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20211019160219.5202-12-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      2e77a62c
    • M
      arm64: extable: add `type` and `data` fields · d6e2cc56
      Committed by Mark Rutland
      Subsequent patches will add specialized handlers for fixups, in addition
      to the simple PC fixup and BPF handlers we have today. In preparation,
      this patch adds a new `type` field to struct exception_table_entry, and
      uses this to distinguish the fixup and BPF cases. A `data` field is also
      added so that subsequent patches can associate data specific to each
      exception site (e.g. register numbers).
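      
      After this change the entry has the following shape (a sketch of the
      12-byte layout described below):
      
      | struct exception_table_entry {
      | 	int insn, fixup;	/* relative offsets to the insn and its fixup */
      | 	short type, data;	/* handler type + per-site data (e.g. GPR numbers) */
      | };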
      
      Handlers are named ex_handler_*() for consistency, following the example
      of x86. At the same time, get_ex_fixup() is split out into a helper so
      that it can be used by other ex_handler_*() functions in subsequent
      patches.
      
      This patch will increase the size of the exception tables, which will be
      remedied by subsequent patches removing redundant fixup code. There
      should be no functional change as a result of this patch.
      
      Since each entry is now 12 bytes in size, we must reduce the alignment
      of each entry from `.align 3` (i.e. 8 bytes) to `.align 2` (i.e. 4
      bytes), which is the natural alignment of the `insn` and `fixup` fields.
      The current 8-byte alignment is a holdover from when the `insn` and
      `fixup` fields were 8 bytes each, and while not harmful it has not been necessary
      since commit:
      
        6c94f27a ("arm64: switch to relative exception tables")
      
      Similarly, RO_EXCEPTION_TABLE_ALIGN is dropped to 4 bytes.
      
      Concurrently with this patch, x86's exception table entry format is
      being updated (similarly to a 12-byte format, with 32 bits of absolute
      data). Once both have been merged it should be possible to unify the
      sorttable logic for the two.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andrii Nakryiko <andrii@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: James Morse <james.morse@arm.com>
      Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20211019160219.5202-11-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      d6e2cc56
    • M
      arm64: extable: make fixup_exception() return bool · e8c328d7
      Committed by Mark Rutland
      The return values of fixup_exception() and arm64_bpf_fixup_exception()
      represent a boolean condition rather than an error code, so for clarity
      it would be better to return `bool` rather than `int`.
      
      This patch adjusts the code accordingly. While we're modifying the
      prototype, we also remove the unnecessary `extern` keyword, so that this
      won't look out of place when we make subsequent additions to the header.
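      
      The resulting declarations are simply (a sketch of the adjusted header,
      not necessarily character-exact):
      
      | bool fixup_exception(struct pt_regs *regs);
      | bool arm64_bpf_fixup_exception(const struct exception_table_entry *ex,
      | 			       struct pt_regs *regs);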
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andrii Nakryiko <andrii@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: James Morse <james.morse@arm.com>
      Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20211019160219.5202-9-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      e8c328d7
    • M
      arm64: extable: consolidate definitions · 819771cc
      Committed by Mark Rutland
      In subsequent patches we'll alter the structure and usage of struct
      exception_table_entry. For inline assembly, we create these using the
      `_ASM_EXTABLE()` CPP macro defined in <asm/uaccess.h>, and for plain
      assembly code we use the `_asm_extable()` GAS macro defined in
      <asm/assembler.h>, which are largely identical save for different
      escaping and stringification requirements.
      
      This patch moves the common definitions to a new <asm/asm-extable.h>
      header, so that it's easier to keep the two in sync, and to remove the
      implication that these are only used for uaccess helpers (as e.g.
      load_unaligned_zeropad() is only used on kernel memory, and depends upon
      `_ASM_EXTABLE()`).
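      
      For illustration, the consolidated entry-emitting macro looks roughly like
      this (a sketch of the CPP variant; the GAS variant is structurally
      identical, and the `__ASM_EXTABLE_RAW()` factoring is described below):
      
      | #define __ASM_EXTABLE_RAW(insn, fixup)		\
      | 	".pushsection	__ex_table, \"a\"\n"		\
      | 	".align		3\n"				\
      | 	".long		((" insn ") - .)\n"		\
      | 	".long		((" fixup ") - .)\n"		\
      | 	".popsection\n"
      |
      | #define _ASM_EXTABLE(insn, fixup)	__ASM_EXTABLE_RAW(#insn, #fixup)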
      
      At the same time, a few minor modifications are made for clarity and in
      preparation for subsequent patches:
      
      * The structure creation is factored out into an `__ASM_EXTABLE_RAW()`
        macro. This will make it easier to support different fixup variants in
        subsequent patches without needing to update all users of
        `_ASM_EXTABLE()`, and makes it easier to see that the CPP and GAS
        variants of the macros are structurally identical.
      
        For the CPP macro, the stringification of fields is left to the
        wrapper macro, `_ASM_EXTABLE()`, as in subsequent patches it will be
        necessary to stringify fields in wrapper macros to safely concatenate
        strings which cannot be token-pasted together in CPP.
      
      * The fields of the structure are created separately on their own lines.
        This will make it easier to add/remove/modify individual fields
        clearly.
      
      * Additional parentheses are added around the use of macro arguments in
        field definitions to avoid any potential problems with evaluation due
        to operator precedence, and to make errors upon misuse clearer.
      
      * USER() is moved into <asm/asm-uaccess.h>, as it is not required by all
        assembly code, and is already referred to by comments in that file.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20211019160219.5202-8-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      819771cc
    • M
      arm64: gpr-num: support W registers · 286fba6c
      Committed by Mark Rutland
      In subsequent patches we'll want to map W registers to their register
      numbers. Update gpr-num.h so that we can do this.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20211019160219.5202-7-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      286fba6c
    • M
      arm64: factor out GPR numbering helpers · 8ed1b498
      Committed by Mark Rutland
      In <asm/sysreg.h> we have macros to convert the names of general purpose
      registers (GPRs) into integer constants, which we use to manually build
      the encoding for `MRS` and `MSR` instructions where we can't rely on the
      assembler to do so for us.
      
      In subsequent patches we'll need to map the same GPR names to integer
      constants so that we can use this to build metadata for exception
      fixups.
      
      So that we can use the mappings elsewhere, factor out the
      definitions into a new <asm/gpr-num.h> header, renaming the definitions
      to align with this "GPR num" naming for clarity.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20211019160219.5202-6-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      8ed1b498
    • M
      arm64: kvm: use kvm_exception_table_entry · ae2b2f33
      Committed by Mark Rutland
      In subsequent patches we'll alter `struct exception_table_entry`, adding
      fields that are not needed for KVM exception fixups.
      
      In preparation for this, migrate KVM to its own `struct
      kvm_exception_table_entry`, which is identical to the current format of
      `struct exception_table_entry`. Comments are updated accordingly.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Alexandru Elisei <alexandru.elisei@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Acked-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20211019160219.5202-5-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      ae2b2f33
    • N
      arm64: vdso32: drop test for -march=armv8-a · a517faa9
      Committed by Nick Desaulniers
      As Arnd points out:
        gcc-4.8 already supported -march=armv8, and we require gcc-5.1 now, so
        both this #if/#else construct and the corresponding
        "cc32-option,-march=armv8-a" check should be obsolete now.
      
      Link: https://lore.kernel.org/lkml/CAK8P3a3UBEJ0Py2ycz=rHfgog8g3mCOeQOwO0Gmp-iz6Uxkapg@mail.gmail.com/
      Suggested-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
      Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Reviewed-by: Nathan Chancellor <nathan@kernel.org>
      Link: https://lore.kernel.org/r/20211019223646.1146945-3-ndesaulniers@google.com
      Signed-off-by: Will Deacon <will@kernel.org>
      a517faa9
    • N
      arm64: vdso32: drop the test for dmb ishld · 1907d3ff
      Committed by Nick Desaulniers
      Binutils added support for this instruction in commit
      e797f7e0b2bedc9328d4a9a0ebc63ca7a2dbbebc which shipped in 2.24 (just
      missing the 2.23 release) but was cherry-picked into 2.23 in commit
      27a50d6755bae906bc73b4ec1a8b448467f0bea1. Thanks to Christian and Simon
      for helping me with the patch archaeology.
      
      According to Documentation/process/changes.rst, the minimum supported
      version of binutils is 2.23. Since all supported versions of GAS support
      this instruction, drop the assembler invocation, preprocessor
      flags/guards, and the cross assembler macro that's now unused.
      
      This also avoids a recursive self reference in a follow up cleanup
      patch.
      
      Cc: Christian Biesinger <cbiesinger@google.com>
      Cc: Simon Marchi <simon.marchi@polymtl.ca>
      Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
      Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Reviewed-by: Nathan Chancellor <nathan@kernel.org>
      Link: https://lore.kernel.org/r/20211019223646.1146945-2-ndesaulniers@google.com
      Signed-off-by: Will Deacon <will@kernel.org>
      1907d3ff
    • M
      arm64/sve: Track vector lengths for tasks in an array · 5838a155
      Committed by Mark Brown
      As with SVE, we will track a per-task vector length for SME. Convert
      the existing storage for the vector length into an array and update
      fpsimd_flush_task() to initialise this in a helper function.
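      
      A sketch of the resulting storage in thread_struct (names follow the
      approach described, but are illustrative rather than verbatim):
      
      | enum vec_type {
      | 	ARM64_VEC_SVE = 0,
      | 	ARM64_VEC_MAX,		/* SME will add another entry here */
      | };
      |
      | struct thread_struct {
      | 	/* ... */
      | 	unsigned int	vl[ARM64_VEC_MAX];		/* current VL per type */
      | 	unsigned int	vl_onexec[ARM64_VEC_MAX];	/* VL to use on exec */
      | 	/* ... */
      | };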
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Link: https://lore.kernel.org/r/20211019172247.3045838-10-broonie@kernel.org
      Signed-off-by: Will Deacon <will@kernel.org>
      5838a155
    • M
      arm64/sve: Explicitly load vector length when restoring SVE state · ddc806b5
      Committed by Mark Brown
      Currently when restoring the SVE state we supply the SVE vector length
      as an argument to sve_load_state() and the underlying macros. This becomes
      inconvenient with the addition of SME since we may need to restore any
      combination of SVE and SME vector lengths, and we already separately
      restore the vector length in the KVM code. We don't need to know the vector
      length during the actual register load since the SME load instructions can
      index into the data array for us.
      
      Refactor the interface so we explicitly set the vector length separately
      from restoring the SVE registers, in preparation for adding SME support; no
      functional change should be involved.
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Link: https://lore.kernel.org/r/20211019172247.3045838-9-broonie@kernel.org
      Signed-off-by: Will Deacon <will@kernel.org>
      ddc806b5
    • M
      arm64/sve: Put system wide vector length information into structs · b5bc00ff
      Committed by Mark Brown
      With the introduction of SME we will have a second vector length in the
      system, enumerated and configured in a very similar fashion to the
      existing SVE vector length.  While there are a few differences in how
      things are handled this is a relatively small portion of the overall
      code, so in order to avoid code duplication we factor out the shared handling.
      
      We create two structs, one vl_info for the static hardware properties
      and one vl_config for the runtime configuration, with an array
      instantiated for each and update all the users to reference these. Some
      accessor functions are provided where helpful for readability, and the
      write to set the vector length is put into a function since the system
      register being updated needs to be chosen at compile time.
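      
      A minimal sketch of the split (fields are condensed for illustration; the
      real structs carry more state):
      
      | struct vl_info {			/* static hardware properties, per vector type */
      | 	enum vec_type	type;
      | 	const char	*name;
      | 	int		min_vl;
      | 	int		max_vl;
      | };
      |
      | struct vl_config {			/* runtime configuration */
      | 	int		__default_vl;	/* set via sysctl, read through an accessor */
      | };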
      
      This is a mostly mechanical replacement, further work will be required
      to actually make things generic, ensuring that we handle those places
      where there are differences properly.
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Link: https://lore.kernel.org/r/20211019172247.3045838-8-broonie@kernel.org
      Signed-off-by: Will Deacon <will@kernel.org>
      b5bc00ff
    • M
      arm64/sve: Use accessor functions for vector lengths in thread_struct · 0423eedc
      Committed by Mark Brown
      In a system with SME there are parallel vector length controls for SVE and
      SME vectors which function in much the same way so it is desirable to
      share the code for handling them as much as possible. In order to prepare
      for doing this add a layer of accessor functions for the various VL related
      operations on tasks.
      
      Since almost all current interactions are actually via task->thread rather
      than directly with the thread_info, the accessors use that. Accessors are
      provided for both generic and SVE specific usage, the generic accessors
      should be used for cases where register state is being manipulated since
      the registers are shared between streaming and regular SVE so we know that
      when SME support is implemented we will always have to be in the appropriate
      mode already and hence can generalise now.
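      
      The accessor layer is of roughly this shape (signatures illustrative of
      the approach rather than verbatim):
      
      | /* Generic accessors, keyed by vector type ... */
      | unsigned int task_get_vl(const struct task_struct *task, enum vec_type type);
      | void task_set_vl(struct task_struct *task, enum vec_type type,
      | 		 unsigned long vl);
      |
      | /* ... plus SVE-specific wrappers used by existing callers. */
      | unsigned int task_get_sve_vl(const struct task_struct *task);
      | void task_set_sve_vl(struct task_struct *task, unsigned long vl);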
      
      Since we are using task_struct and we don't want to cause widespread
      inclusion of sched.h, the accessors are all out of line; it is hoped that
      none of the uses are in a sufficiently critical path for this to be an
      issue. Those that are most likely to present an issue are in the same
      translation unit so hopefully the compiler may be able to inline anyway.
      
      This is purely adding the layer of abstraction, additional work will be
      needed to support tasks using SME.
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Link: https://lore.kernel.org/r/20211019172247.3045838-7-broonie@kernel.org
      Signed-off-by: Will Deacon <will@kernel.org>
      0423eedc
    • M
      arm64/sve: Make access to FFR optional · 9f584866
      Committed by Mark Brown
      SME introduces streaming SVE mode in which FFR is not present and the
      instructions for accessing it UNDEF. In preparation for handling this,
      update the low-level SVE state access functions to take a flag specifying
      if FFR should be handled. When saving the register state we store a zero
      for FFR to guard against uninitialized data being read. No behaviour change
      should be introduced by this patch.
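      
      A sketch of the flagged low-level interface (prototypes illustrative of
      the change, not necessarily character-exact):
      
      | void sve_save_state(void *state, u32 *pfpsr, int save_ffr);
      | void sve_load_state(const void *state, const u32 *pfpsr, int restore_ffr);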
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Link: https://lore.kernel.org/r/20211019172247.3045838-5-broonie@kernel.org
      Signed-off-by: Will Deacon <will@kernel.org>
      9f584866
    • M
      arm64/sve: Make sve_state_size() static · 12cc2352
      Committed by Mark Brown
      There are no users outside fpsimd.c so make sve_state_size() static.
      KVM open codes an equivalent.
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Link: https://lore.kernel.org/r/20211019172247.3045838-4-broonie@kernel.org
      Signed-off-by: Will Deacon <will@kernel.org>
      12cc2352
    • M
      arm64/sve: Remove sve_load_from_fpsimd_state() · b53223e0
      Committed by Mark Brown
      Following optimisations of the SVE register handling we no longer load the
      SVE state from a saved copy of the FPSIMD registers; instead we convert
      directly in registers or from one saved state to another. Remove the function so we
      don't need to update it during further refactoring.
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Link: https://lore.kernel.org/r/20211019172247.3045838-3-broonie@kernel.org
      Signed-off-by: Will Deacon <will@kernel.org>
      b53223e0
  9. 19 October 2021, 3 commits
  10. 18 October 2021, 7 commits
  11. 17 October 2021, 1 commit
    • M
      KVM: arm64: vgic-v3: Reduce common group trapping to ICV_DIR_EL1 when possible · 0924729b
      Committed by Marc Zyngier
      On systems that advertise ICH_VTR_EL2.SEIS, we trap all GICv3 sysreg
      accesses from the guest. From a performance perspective, this is OK
      as long as the guest doesn't hammer the GICv3 CPU interface.
      
      In most cases, this is fine, unless the guest actively uses
      priorities and switches PMR_EL1 very often. Which is exactly what
      happens when a Linux guest runs with irqchip.gicv3_pseudo_nmi=1.
      In these conditions, the performance plummets as we hit PMR each time
      we mask/unmask interrupts. Not good.
      
      There is however an opportunity for improvement. Careful reading
      of the architecture specification indicates that the only GICv3
      sysreg belonging to the common group (which contains the SGI
      registers, PMR, DIR, CTLR and RPR) that is allowed to generate
      a SError is DIR. Everything else is safe.
      
      It is thus possible to substitute the trapping of all the common
      group with just that of DIR, if it is supported by the implementation.
      Yes, that's yet another optional bit of the architecture.
      So let's just do that, as it leads to some impressive results on
      the M1:
      
      Without this change:
      	bash-5.1# /host/home/maz/hackbench 100 process 1000
      	Running with 100*40 (== 4000) tasks.
      	Time: 56.596
      
      With this change:
      	bash-5.1# /host/home/maz/hackbench 100 process 1000
      	Running with 100*40 (== 4000) tasks.
      	Time: 8.649
      
      which is a pretty convincing result.
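      
      Conceptually, the trap configuration becomes the following (a sketch using
      hedged names; the variables and the feature test stand in for the real
      ICH_VTR_EL2/ICH_HCR_EL2 plumbing):
      
      | if (group_traps_needed_for_seis) {
      | 	if (dir_only_trapping_supported)	/* optional architecture feature */
      | 		vgic_hcr |= ICH_HCR_TDIR;	/* trap only ICV_DIR_EL1 */
      | 	else
      | 		vgic_hcr |= ICH_HCR_TC;		/* trap the whole common group */
      | }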
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
      Link: https://lore.kernel.org/r/20211010150910.2911495-4-maz@kernel.org
      0924729b