1. 09 Mar 2018, 1 commit
    • arm64/kernel: don't ban ADRP to work around Cortex-A53 erratum #843419 · a257e025
      Ard Biesheuvel authored
      Working around Cortex-A53 erratum #843419 involves special handling of
      ADRP instructions that end up in the last two instruction slots of a
      4k page, or whose output register gets overwritten without having been
      read. (Note that the latter instruction sequence is never emitted by
      a properly functioning compiler, which is why it is disregarded by the
      handling of the same erratum in the bfd.ld linker which we rely on for
      the core kernel)
      
      Normally, this gets taken care of by the linker, which can spot such
      sequences at final link time, and insert a veneer if the ADRP ends up
      at a vulnerable offset. However, linux kernel modules are partially
      linked ELF objects, and so there is no 'final link time' other than the
      runtime loading of the module, at which time all the static relocations
      are resolved.
      
      For this reason, we have implemented the #843419 workaround for modules
      by avoiding ADRP instructions altogether, by using the large C model,
      and by passing -mpc-relative-literal-loads to recent versions of GCC
      that may emit adrp/ldr pairs to perform literal loads. However, this
      workaround forces us to keep literal data mixed with the instructions
      in the executable .text segment, and literal data may inadvertently
      turn into an exploitable speculative gadget depending on the relative
      offsets of arbitrary symbols.
      
      So let's reimplement this workaround in a way that allows us to switch
      back to the small C model, and to drop the -mpc-relative-literal-loads
      GCC switch, by patching affected ADRP instructions at runtime:
      - ADRP instructions that do not appear at 4k relative offset 0xff8 or
        0xffc are ignored
      - ADRP instructions that are within 1 MB of their target symbol are
        converted into ADR instructions
      - remaining ADRP instructions are redirected via a veneer that performs
        the load using an unaffected movn/movk sequence.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      [will: tidied up ADRP -> ADR instruction patching.]
      [will: use ULL suffix for 64-bit immediate]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
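
      A minimal standalone sketch of the patching decision described above
      (illustrative only, not the kernel's actual module-loader code; the
      constants and helper names are made up):

        #include <stdint.h>
        #include <stdbool.h>

        #define ADRP_MASK   0x9f000000u
        #define ADRP_OPCODE 0x90000000u
        #define ADR_OPCODE  0x10000000u

        /* Only ADRPs in the last two slots of a 4 KB page are affected. */
        static bool adrp_in_erratum_slot(uint64_t place, uint32_t insn)
        {
                if ((insn & ADRP_MASK) != ADRP_OPCODE)
                        return false;
                return (place & 0xfffu) == 0xff8u || (place & 0xfffu) == 0xffcu;
        }

        /* Returns false if the ADRP must instead go via a movn/movk veneer. */
        static bool patch_adrp(uint64_t place, uint32_t insn, uint64_t target,
                               uint32_t *new_insn)
        {
                int64_t off = (int64_t)(target - place);

                if (!adrp_in_erratum_slot(place, insn)) {
                        *new_insn = insn;       /* not vulnerable: leave it alone */
                        return true;
                }

                if (off >= -(1 << 20) && off < (1 << 20)) {
                        /* Within +/- 1 MB: rewrite as ADR, keeping Rd. */
                        uint32_t rd    = insn & 0x1fu;
                        uint32_t immlo = ((uint32_t)off & 0x3u) << 29;
                        uint32_t immhi = (((uint32_t)off >> 2) & 0x7ffffu) << 5;

                        *new_insn = ADR_OPCODE | immlo | immhi | rd;
                        return true;
                }

                return false;   /* out of ADR range: redirect via a veneer */
        }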
  2. 08 Mar 2018, 2 commits
    • arm64/kernel: kaslr: reduce module randomization range to 4 GB · f2b9ba87
      Ard Biesheuvel authored
      We currently have to rely on the GCC large code model for KASLR for
      two distinct but related reasons:
      - if we enable full randomization, modules will be loaded very far away
        from the core kernel, where they are out of range for ADRP instructions,
      - even without full randomization, the fact that the 128 MB module region
        is now no longer fully reserved for kernel modules means that there is
        a very low likelihood that the normal bottom-up allocation of other
        vmalloc regions may collide, and use up the range for other things.
      
      Large model code is suboptimal, given that each symbol reference involves
      a literal load that goes through the D-cache, reducing cache utilization.
      But more importantly, literals are not instructions but part of .text
      nonetheless, and hence mapped with executable permissions.
      
      So let's get rid of our dependency on the large model for KASLR, by:
      - reducing the full randomization range to 4 GB, thereby ensuring that
        ADRP references between modules and the kernel are always in range,
      - reducing the spillover range to 4 GB as well, so that we fall back to
        a region that is still guaranteed to be in range,
      - moving the randomization window of the core kernel to the middle of
        the VMALLOC space.
      
      Note that KASAN always uses the module region outside of the vmalloc space,
      so keep the kernel close to that if KASAN is enabled.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
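
      A rough sketch of the range reduction described above (standalone C
      with invented names; the real logic lives in the arm64 KASLR setup
      code):

        #include <stdint.h>

        #define SZ_4G (1ULL << 32)

        /*
         * Pick a randomized module base so that every module stays within
         * the +/- 4 GB reach of an ADRP relative to the core kernel image.
         */
        static uint64_t pick_module_base(uint64_t kernel_start,
                                         uint64_t kernel_end,
                                         uint64_t seed)
        {
                uint64_t module_range = SZ_4G - (kernel_end - kernel_start);
                uint64_t offset = seed % module_range;

                /* Window ends at the kernel, so references stay in range. */
                return kernel_end - SZ_4G + offset;
        }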
    • arm64: module: don't BUG when exceeding preallocated PLT count · 5e8307b9
      Ard Biesheuvel authored
      When PLTs are emitted at relocation time, we really should not exceed
      the number that we counted when parsing the relocation tables, and so
      currently, we BUG() on this condition. However, even though this is a
      clear bug in this particular piece of code, we can easily recover by
      failing to load the module.
      
      So instead, return 0 from module_emit_plt_entry() if this condition
      occurs; since 0 is not a valid kernel address, it can serve as a flag
      value that makes the relocation routine bail out.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
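
      A simplified sketch of the recovery path described above (hypothetical
      types and names, not the actual module PLT code):

        #include <stdint.h>
        #include <stdio.h>

        struct plt_section {
                uint64_t     *entries;  /* preallocated PLT slots */
                unsigned int  count;    /* slots used so far */
                unsigned int  max;      /* slots counted while scanning relocations */
        };

        static uint64_t emit_plt_entry(struct plt_section *plt, uint64_t target)
        {
                uint64_t *slot;

                if (plt->count >= plt->max) {
                        /* Counting bug, but recoverable: 0 is never a valid
                           kernel address, so the caller can fail the load. */
                        fprintf(stderr, "PLT slots exhausted\n");
                        return 0;
                }

                slot = &plt->entries[plt->count++];
                *slot = target;         /* stand-in for the real branch veneer */
                return (uint64_t)(uintptr_t)slot;
        }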
  3. 07 Mar 2018, 9 commits
  4. 05 Mar 2018, 4 commits
  5. 23 Feb 2018, 1 commit
    • arm64: fix unwind_frame() for filtered out fn for function graph tracing · 9f416319
      Pratyush Anand authored
      do_task_stat() calls get_wchan(), which further does unwind_frame().
      unwind_frame() restores frame->pc to its original value in case the
      function graph tracer has modified a return address (LR) in a stack
      frame to hook a function return. However, if the function graph tracer
      has hit a filtered function, we cannot unwind it, because
      ftrace_push_return_trace() has biased the index (frame->graph) with a
      'huge negative' offset (-FTRACE_NOTRACE_DEPTH).
      
      Moreover, the arm64 stack walker defines the index (frame->graph) as an
      unsigned int, which cannot be compared against a negative value.
      
      The same problem can occur when walk_stackframe() is called from
      save_stack_trace_tsk() or dump_backtrace().
      
      This patch fixes unwind_frame() to check the index for a negative value
      and restore it accordingly before restoring frame->pc.
      
      Reproducer:
      
      cd /sys/kernel/debug/tracing/
      echo schedule > set_graph_notrace
      echo 1 > options/display-graph
      echo wakeup > current_tracer
      ps -ef | grep -i agent
      
      Above commands result in:
      Unable to handle kernel paging request at virtual address ffff801bd3d1e000
      pgd = ffff8003cbe97c00
      [ffff801bd3d1e000] *pgd=0000000000000000, *pud=0000000000000000
      Internal error: Oops: 96000006 [#1] SMP
      [...]
      CPU: 5 PID: 11696 Comm: ps Not tainted 4.11.0+ #33
      [...]
      task: ffff8003c21ba000 task.stack: ffff8003cc6c0000
      PC is at unwind_frame+0x12c/0x180
      LR is at get_wchan+0xd4/0x134
      pc : [<ffff00000808892c>] lr : [<ffff0000080860b8>] pstate: 60000145
      sp : ffff8003cc6c3ab0
      x29: ffff8003cc6c3ab0 x28: 0000000000000001
      x27: 0000000000000026 x26: 0000000000000026
      x25: 00000000000012d8 x24: 0000000000000000
      x23: ffff8003c1c04000 x22: ffff000008c83000
      x21: ffff8003c1c00000 x20: 000000000000000f
      x19: ffff8003c1bc0000 x18: 0000fffffc593690
      x17: 0000000000000000 x16: 0000000000000001
      x15: 0000b855670e2b60 x14: 0003e97f22cf1d0f
      x13: 0000000000000001 x12: 0000000000000000
      x11: 00000000e8f4883e x10: 0000000154f47ec8
      x9 : 0000000070f367c0 x8 : 0000000000000000
      x7 : 00008003f7290000 x6 : 0000000000000018
      x5 : 0000000000000000 x4 : ffff8003c1c03cb0
      x3 : ffff8003c1c03ca0 x2 : 00000017ffe80000
      x1 : ffff8003cc6c3af8 x0 : ffff8003d3e9e000
      
      Process ps (pid: 11696, stack limit = 0xffff8003cc6c0000)
      Stack: (0xffff8003cc6c3ab0 to 0xffff8003cc6c4000)
      [...]
      [<ffff00000808892c>] unwind_frame+0x12c/0x180
      [<ffff000008305008>] do_task_stat+0x864/0x870
      [<ffff000008305c44>] proc_tgid_stat+0x3c/0x48
      [<ffff0000082fde0c>] proc_single_show+0x5c/0xb8
      [<ffff0000082b27e0>] seq_read+0x160/0x414
      [<ffff000008289e6c>] __vfs_read+0x58/0x164
      [<ffff00000828b164>] vfs_read+0x88/0x144
      [<ffff00000828c2e8>] SyS_read+0x60/0xc0
      [<ffff0000080834a0>] __sys_trace_return+0x0/0x4
      
      Fixes: 20380bb3 ("arm64: ftrace: fix a stack tracer's output under function graph tracer")
      Signed-off-by: Pratyush Anand <panand@redhat.com>
      Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
      [catalin.marinas@arm.com: replace WARN_ON with WARN_ON_ONCE]
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
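
      A simplified sketch of the index fix-up described above (standalone C
      with invented structures; the real change is in the arm64 stack
      walker):

        #include <stdint.h>

        #define FTRACE_NOTRACE_DEPTH 65536

        struct ret_entry { uint64_t ret; };

        struct stackframe_sketch {
                uint64_t pc;
                int      graph;         /* now signed, so the bias is visible */
        };

        static void graph_fixup_pc(struct stackframe_sketch *frame,
                                   const struct ret_entry *ret_stack,
                                   uint64_t return_to_handler)
        {
                if (frame->pc != return_to_handler)
                        return;

                /* Filtered (set_graph_notrace) entries carry a huge negative bias. */
                if (frame->graph < 0)
                        frame->graph += FTRACE_NOTRACE_DEPTH;

                /* Restore the return address the graph tracer replaced. */
                frame->pc = ret_stack[frame->graph--].ret;
        }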
  6. 22 Feb 2018, 1 commit
    • treewide/trivial: Remove ';;$' typo noise · ed7158ba
      Ingo Molnar authored
      On lkml, suggestions were made to split up such trivial typo fixes into
      per-subsystem patches:
      
        --- a/arch/x86/boot/compressed/eboot.c
        +++ b/arch/x86/boot/compressed/eboot.c
        @@ -439,7 +439,7 @@ setup_uga32(void **uga_handle, unsigned long size, u32 *width, u32 *height)
                struct efi_uga_draw_protocol *uga = NULL, *first_uga;
                efi_guid_t uga_proto = EFI_UGA_PROTOCOL_GUID;
                unsigned long nr_ugas;
        -       u32 *handles = (u32 *)uga_handle;;
        +       u32 *handles = (u32 *)uga_handle;
                efi_status_t status = EFI_INVALID_PARAMETER;
                int i;
      
      This patch is the result of the following script:
      
        $ sed -i 's/;;$/;/g' $(git grep -E ';;$'  | grep "\.[ch]:"  | grep -vwE 'for|ia64' | cut -d: -f1 | sort | uniq)
      
      ... followed by manual review to make sure it's all good.
      
      Splitting this up is just crazy talk, let's get over with this and just do it.
      Reported-by: Pavel Machek <pavel@ucw.cz>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  7. 20 Feb 2018, 5 commits
    • arm64: perf: correct PMUVer probing · 0331365e
      Mark Rutland authored
      The ID_AA64DFR0_EL1.PMUVer field doesn't follow the usual ID registers
      scheme. While value 0xf indicates a non-architected PMU is implemented,
      values 0x1 to 0xe indicate an increasingly featureful architected PMU,
      as if the field were unsigned.
      
      For more details, see ARM DDI 0487C.a, D10.1.4, "Alternative ID scheme
      used for the Performance Monitors Extension version".
      
      Currently, we treat the field as signed, and erroneously bail out for
      values 0x8 to 0xe. Let's correct that.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
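
      The gist of the change, as a standalone sketch (field position per the
      architecture manual; the helper name is made up):

        #include <stdint.h>
        #include <stdbool.h>

        #define ID_AA64DFR0_PMUVER_SHIFT 8

        static bool have_architected_pmu(uint64_t dfr0)
        {
                /* Extract PMUVer as an unsigned field, per the alternative
                   ID scheme: 0x1..0xe are increasingly featureful PMUs. */
                unsigned int pmuver = (dfr0 >> ID_AA64DFR0_PMUVER_SHIFT) & 0xf;

                return pmuver != 0x0 && pmuver != 0xf;
        }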
    • arm64: __show_regs: Only resolve kernel symbols when running at EL1 · a06f818a
      Will Deacon authored
      __show_regs pretty prints PC and LR by attempting to map them to kernel
      function names to improve the utility of crash reports. Unfortunately,
      this mapping is applied even when the pt_regs corresponds to user mode,
      resulting in a KASLR oracle.
      
      Avoid this issue by only looking up the function symbols when the register
      state indicates that we're actually running at EL1.
      
      Cc: <stable@vger.kernel.org>
      Reported-by: NCSC Security <security@ncsc.gov.uk>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
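
      A standalone sketch of the idea (hypothetical types and helper; the
      real check lives in __show_regs):

        #include <stdint.h>
        #include <stdio.h>

        #define PSR_MODE_MASK 0x0000000fULL
        #define PSR_MODE_EL0t 0x00000000ULL

        struct regs_sketch { uint64_t pc, lr, pstate; };

        static void show_pc_lr(const struct regs_sketch *regs,
                               const char *(*symbolize)(uint64_t addr))
        {
                int user = (regs->pstate & PSR_MODE_MASK) == PSR_MODE_EL0t;

                if (!user) {
                        /* Kernel context: symbol names are fine here. */
                        printf("pc : %s\n", symbolize(regs->pc));
                        printf("lr : %s\n", symbolize(regs->lr));
                } else {
                        /* User context: raw values only, no KASLR oracle. */
                        printf("pc : %016llx\n", (unsigned long long)regs->pc);
                        printf("lr : %016llx\n", (unsigned long long)regs->lr);
                }
        }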
    • arm64: Remove unimplemented syscall log message · 1962682d
      Michael Weiser authored
      Stop printing a (ratelimited) kernel message for each instance of an
      unimplemented syscall being called. Userland making an unimplemented
      syscall is not necessarily misbehaviour and to be expected with a
      current userland running on an older kernel. Also, the current message
      looks scary to users but does not actually indicate a real problem nor
      help them narrow down the cause. Just rely on sys_ni_syscall() to return
      -ENOSYS.
      
      Cc: <stable@vger.kernel.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Michael Weiser <michael.weiser@gmx.de>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: Disable unhandled signal log messages by default · 5ee39a71
      Michael Weiser authored
      aarch64 unhandled-signal kernel messages are very verbose, suggesting
      that they are more of a debugging aid:
      
      sigsegv[33]: unhandled level 2 translation fault (11) at 0x00000000, esr
      0x92000046, in sigsegv[400000+71000]
      CPU: 1 PID: 33 Comm: sigsegv Tainted: G        W        4.15.0-rc3+ #3
      Hardware name: linux,dummy-virt (DT)
      pstate: 60000000 (nZCv daif -PAN -UAO)
      pc : 0x4003f4
      lr : 0x4006bc
      sp : 0000fffffe94a060
      x29: 0000fffffe94a070 x28: 0000000000000000
      x27: 0000000000000000 x26: 0000000000000000
      x25: 0000000000000000 x24: 00000000004001b0
      x23: 0000000000486ac8 x22: 00000000004001c8
      x21: 0000000000000000 x20: 0000000000400be8
      x19: 0000000000400b30 x18: 0000000000484728
      x17: 000000000865ffc8 x16: 000000000000270f
      x15: 00000000000000b0 x14: 0000000000000002
      x13: 0000000000000001 x12: 0000000000000000
      x11: 0000000000000000 x10: 0008000020008008
      x9 : 000000000000000f x8 : ffffffffffffffff
      x7 : 0004000000000000 x6 : ffffffffffffffff
      x5 : 0000000000000000 x4 : 0000000000000000
      x3 : 00000000004003e4 x2 : 0000fffffe94a1e8
      x1 : 000000000000000a x0 : 0000000000000000
      
      Disable them by default, so they can be enabled using
      /proc/sys/debug/exception-trace.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Michael Weiser <michael.weiser@gmx.de>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
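
      The effect can be pictured with a small sketch (invented names; the
      actual knob is the show_unhandled_signals sysctl behind
      /proc/sys/debug/exception-trace):

        #include <stdarg.h>
        #include <stdio.h>

        /* Now defaults to off; root can flip it via the sysctl. */
        static int show_unhandled_signals = 0;

        static void report_unhandled_signal(const char *fmt, ...)
        {
                va_list ap;

                if (!show_unhandled_signals)
                        return;         /* silent by default */

                va_start(ap, fmt);
                vfprintf(stderr, fmt, ap);
                va_end(ap);
        }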
    • arm64: cpufeature: Fix CTR_EL0 field definitions · be68a8aa
      Will Deacon authored
      Our field definitions for CTR_EL0 suffer from a number of problems:
      
        - The IDC and DIC fields are missing, which causes us to enable CTR
          trapping on CPUs with either of these returning non-zero values.
      
        - The ERG is FTR_LOWER_SAFE, whereas it should be treated like CWG as
          FTR_HIGHER_SAFE so that applications can use it to avoid false sharing.
      
        - [nit] A RES1 field is described as "RAO"
      
      This patch updates the CTR_EL0 field definitions to fix these issues.
      
      Cc: <stable@vger.kernel.org>
      Cc: Shanker Donthineni <shankerd@codeaurora.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
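
      A toy illustration of the LOWER_SAFE/HIGHER_SAFE distinction at play
      here (standalone C; field positions follow CTR_EL0, everything else is
      invented):

        #include <stdint.h>

        enum ftr_safe { FTR_LOWER_SAFE, FTR_HIGHER_SAFE };

        struct ftr_field {
                const char   *name;
                unsigned int  shift, width;
                enum ftr_safe safe;
        };

        static const struct ftr_field ctr_fields[] = {
                { "DIC", 29, 1, FTR_LOWER_SAFE },       /* previously missing */
                { "IDC", 28, 1, FTR_LOWER_SAFE },       /* previously missing */
                { "CWG", 24, 4, FTR_HIGHER_SAFE },
                { "ERG", 20, 4, FTR_HIGHER_SAFE },      /* was LOWER_SAFE */
        };

        static uint64_t ftr_value(const struct ftr_field *f, uint64_t reg)
        {
                return (reg >> f->shift) & ((1ULL << f->width) - 1);
        }

        /* System-wide "safe" value when two CPUs disagree on a field. */
        static uint64_t ftr_safe_value(const struct ftr_field *f,
                                       uint64_t a_reg, uint64_t b_reg)
        {
                uint64_t a = ftr_value(f, a_reg), b = ftr_value(f, b_reg);

                if (f->safe == FTR_HIGHER_SAFE)
                        return a > b ? a : b;
                return a < b ? a : b;
        }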
  8. 19 Feb 2018, 1 commit
    • arm64: uaccess: Formalise types for access_ok() · 9085b34d
      Robin Murphy authored
      In converting __range_ok() into a static inline, I inadvertently made
      it more type-safe, but without considering the ordering of the relevant
      conversions. This leads to quite a lot of Sparse noise about the fact
      that we use __chk_user_ptr() after addr has already been converted from
      a user pointer to an unsigned long.
      
      Rather than just adding another cast for the sake of shutting Sparse up,
      it seems reasonable to rework the types to make logical sense (although
      the resulting codegen for __range_ok() remains identical). The only
      callers this affects directly are our compat traps where the inferred
      "user-pointer-ness" of a register value now warrants explicit casting.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
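
      A sketch of the shape of the change (plain C; the __user annotation and
      Sparse plumbing are omitted): the address stays a pointer until the
      single explicit conversion inside the check.

        #include <stdbool.h>
        #include <stddef.h>
        #include <stdint.h>

        static inline bool range_ok(const void *uaddr, size_t size,
                                    uintptr_t limit)
        {
                /* One explicit pointer-to-integer conversion, nothing implicit. */
                uintptr_t addr = (uintptr_t)uaddr;

                return size <= limit && addr <= limit - size;
        }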
  9. 17 Feb 2018, 1 commit
    • arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables · 20a004e7
      Will Deacon authored
      In many cases, page tables can be accessed concurrently by either another
      CPU (due to things like fast gup) or by the hardware page table walker
      itself, which may set access/dirty bits. In such cases, it is important
      to use READ_ONCE/WRITE_ONCE when accessing page table entries so that
      entries cannot be torn, merged or subject to apparent loss of coherence
      due to compiler transformations.
      
      Whilst there are some scenarios where this cannot happen (e.g. pinned
      kernel mappings for the linear region), the overhead of using READ_ONCE
      /WRITE_ONCE everywhere is minimal and makes the code an awful lot easier
      to reason about. This patch consistently uses these macros in the arch
      code, as well as explicitly namespacing pointers to page table entries
      from the entries themselves by adopting a 'p' suffix for the former
      (as is sometimes used elsewhere in the kernel source).
      Tested-by: Yury Norov <ynorov@caviumnetworks.com>
      Tested-by: Richard Ruigrok <rruigrok@codeaurora.org>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
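
      The access pattern, sketched in plain C (stand-in pte type and bit
      layout; the kernel of course uses its own READ_ONCE/WRITE_ONCE):

        #include <stdint.h>

        typedef uint64_t pte_t;

        #define READ_ONCE(x)     (*(const volatile typeof(x) *)&(x))
        #define WRITE_ONCE(x, v) (*(volatile typeof(x) *)&(x) = (v))

        #define PTE_VALID (1ULL << 0)
        #define PTE_DIRTY (1ULL << 55) /* illustrative bit position */

        static int pte_valid_and_dirty(pte_t *ptep)
        {
                /* One coherent snapshot; the walker may update the entry. */
                pte_t pte = READ_ONCE(*ptep);

                return (pte & (PTE_VALID | PTE_DIRTY)) == (PTE_VALID | PTE_DIRTY);
        }

        static void set_pte(pte_t *ptep, pte_t pte)
        {
                /* Single store, so the entry can never be observed torn. */
                WRITE_ONCE(*ptep, pte);
        }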
  10. 12 Feb 2018, 1 commit
  11. 07 Feb 2018, 14 commits