1. 02 9月, 2020 4 次提交
  2. 21 11月, 2019 1 次提交
  3. 17 4月, 2019 1 次提交
  4. 14 11月, 2018 1 次提交
  5. 12 7月, 2018 2 次提交
  6. 25 4月, 2018 2 次提交
    • E
      signal: Ensure every siginfo we send has all bits initialized · 3eb0f519
      Eric W. Biederman 提交于
      Call clear_siginfo to ensure every stack allocated siginfo is properly
      initialized before being passed to the signal sending functions.
      
      Note: It is not safe to depend on C initializers to initialize struct
      siginfo on the stack because C is allowed to skip holes when
      initializing a structure.
      
      The initialization of struct siginfo in tracehook_report_syscall_exit
      was moved from the helper user_single_step_siginfo into
      tracehook_report_syscall_exit itself, to make it clear that the local
      variable siginfo gets fully initialized.
      
      In a few cases the scope of struct siginfo has been reduced to make it
      clear that siginfo siginfo is not used on other paths in the function
      in which it is declared.
      
      Instances of using memset to initialize siginfo have been replaced
      with calls clear_siginfo for clarity.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      3eb0f519
    • M
      arm64: only advance singlestep for user instruction traps · 9478f192
      Mark Rutland 提交于
      Our arm64_skip_faulting_instruction() helper advances the userspace
      singlestep state machine, but this is also called by the kernel BRK
      handler, as used for WARN*().
      
      Thus, if we happen to hit a WARN*() while the user singlestep state
      machine is in the active-no-pending state, we'll advance to the
      active-pending state without having executed a user instruction, and
      will take a step exception earlier than expected when we return to
      userspace.
      
      Let's fix this by only advancing the state machine when skipping a user
      instruction.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      9478f192
  7. 18 4月, 2018 1 次提交
  8. 27 3月, 2018 1 次提交
    • D
      arm64: capabilities: Update prototype for enable call back · c0cda3b8
      Dave Martin 提交于
      We issue the enable() call back for all CPU hwcaps capabilities
      available on the system, on all the CPUs. So far we have ignored
      the argument passed to the call back, which had a prototype to
      accept a "void *" for use with on_each_cpu() and later with
      stop_machine(). However, with commit 0a0d111d
      ("arm64: cpufeature: Pass capability structure to ->enable callback"),
      there are some users of the argument who wants the matching capability
      struct pointer where there are multiple matching criteria for a single
      capability. Clean up the declaration of the call back to make it clear.
      
       1) Renamed to cpu_enable(), to imply taking necessary actions on the
          called CPU for the entry.
       2) Pass const pointer to the capability, to allow the call back to
          check the entry. (e.,g to check if any action is needed on the CPU)
       3) We don't care about the result of the call back, turning this to
          a void.
      
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Acked-by: NRobin Murphy <robin.murphy@arm.com>
      Reviewed-by: NJulien Thierry <julien.thierry@arm.com>
      Signed-off-by: NDave Martin <dave.martin@arm.com>
      [suzuki: convert more users, rename call back and drop results]
      Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      c0cda3b8
  9. 07 3月, 2018 6 次提交
  10. 20 2月, 2018 2 次提交
    • M
      arm64: Remove unimplemented syscall log message · 1962682d
      Michael Weiser 提交于
      Stop printing a (ratelimited) kernel message for each instance of an
      unimplemented syscall being called. Userland making an unimplemented
      syscall is not necessarily misbehaviour and to be expected with a
      current userland running on an older kernel. Also, the current message
      looks scary to users but does not actually indicate a real problem nor
      help them narrow down the cause. Just rely on sys_ni_syscall() to return
      -ENOSYS.
      
      Cc: <stable@vger.kernel.org>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NMichael Weiser <michael.weiser@gmx.de>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      1962682d
    • M
      arm64: Disable unhandled signal log messages by default · 5ee39a71
      Michael Weiser 提交于
      aarch64 unhandled signal kernel messages are very verbose, suggesting
      them to be more of a debugging aid:
      
      sigsegv[33]: unhandled level 2 translation fault (11) at 0x00000000, esr
      0x92000046, in sigsegv[400000+71000]
      CPU: 1 PID: 33 Comm: sigsegv Tainted: G        W        4.15.0-rc3+ #3
      Hardware name: linux,dummy-virt (DT)
      pstate: 60000000 (nZCv daif -PAN -UAO)
      pc : 0x4003f4
      lr : 0x4006bc
      sp : 0000fffffe94a060
      x29: 0000fffffe94a070 x28: 0000000000000000
      x27: 0000000000000000 x26: 0000000000000000
      x25: 0000000000000000 x24: 00000000004001b0
      x23: 0000000000486ac8 x22: 00000000004001c8
      x21: 0000000000000000 x20: 0000000000400be8
      x19: 0000000000400b30 x18: 0000000000484728
      x17: 000000000865ffc8 x16: 000000000000270f
      x15: 00000000000000b0 x14: 0000000000000002
      x13: 0000000000000001 x12: 0000000000000000
      x11: 0000000000000000 x10: 0008000020008008
      x9 : 000000000000000f x8 : ffffffffffffffff
      x7 : 0004000000000000 x6 : ffffffffffffffff
      x5 : 0000000000000000 x4 : 0000000000000000
      x3 : 00000000004003e4 x2 : 0000fffffe94a1e8
      x1 : 000000000000000a x0 : 0000000000000000
      
      Disable them by default, so they can be enabled using
      /proc/sys/debug/exception-trace.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NMichael Weiser <michael.weiser@gmx.de>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      5ee39a71
  11. 16 1月, 2018 1 次提交
    • J
      arm64: kernel: Survive corrected RAS errors notified by SError · 6bf0dcfd
      James Morse 提交于
      Prior to v8.2, SError is an uncontainable fatal exception. The v8.2 RAS
      extensions use SError to notify software about RAS errors, these can be
      contained by the Error Syncronization Barrier.
      
      An ACPI system with firmware-first may use SError as its 'SEI'
      notification. Future patches may add code to 'claim' this SError as a
      notification.
      
      Other systems can distinguish these RAS errors from the SError ESR and
      use the AET bits and additional data from RAS-Error registers to handle
      the error. Future patches may add this kernel-first handling.
      
      Without support for either of these we will panic(), even if we received
      a corrected error. Add code to decode the severity of RAS errors. We can
      safely ignore contained errors where the CPU can continue to make
      progress. For all other errors we continue to panic().
      Signed-off-by: NJames Morse <james.morse@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      6bf0dcfd
  12. 03 11月, 2017 3 次提交
    • D
      arm64/sve: Core task context handling · bc0ee476
      Dave Martin 提交于
      This patch adds the core support for switching and managing the SVE
      architectural state of user tasks.
      
      Calls to the existing FPSIMD low-level save/restore functions are
      factored out as new functions task_fpsimd_{save,load}(), since SVE
      now dynamically may or may not need to be handled at these points
      depending on the kernel configuration, hardware features discovered
      at boot, and the runtime state of the task.  To make these
      decisions as fast as possible, const cpucaps are used where
      feasible, via the system_supports_sve() helper.
      
      The SVE registers are only tracked for threads that have explicitly
      used SVE, indicated by the new thread flag TIF_SVE.  Otherwise, the
      FPSIMD view of the architectural state is stored in
      thread.fpsimd_state as usual.
      
      When in use, the SVE registers are not stored directly in
      thread_struct due to their potentially large and variable size.
      Because the task_struct slab allocator must be configured very
      early during kernel boot, it is also tricky to configure it
      correctly to match the maximum vector length provided by the
      hardware, since this depends on examining secondary CPUs as well as
      the primary.  Instead, a pointer sve_state in thread_struct points
      to a dynamically allocated buffer containing the SVE register data,
      and code is added to allocate and free this buffer at appropriate
      times.
      
      TIF_SVE is set when taking an SVE access trap from userspace, if
      suitable hardware support has been detected.  This enables SVE for
      the thread: a subsequent return to userspace will disable the trap
      accordingly.  If such a trap is taken without sufficient system-
      wide hardware support, SIGILL is sent to the thread instead as if
      an undefined instruction had been executed: this may happen if
      userspace tries to use SVE in a system where not all CPUs support
      it for example.
      
      The kernel will clear TIF_SVE and disable SVE for the thread
      whenever an explicit syscall is made by userspace.  For backwards
      compatibility reasons and conformance with the spirit of the base
      AArch64 procedure call standard, the subset of the SVE register
      state that aliases the FPSIMD registers is still preserved across a
      syscall even if this happens.  The remainder of the SVE register
      state logically becomes zero at syscall entry, though the actual
      zeroing work is currently deferred until the thread next tries to
      use SVE, causing another trap to the kernel.  This implementation
      is suboptimal: in the future, the fastpath case may be optimised
      to zero the registers in-place and leave SVE enabled for the task,
      where beneficial.
      
      TIF_SVE is also cleared in the following slowpath cases, which are
      taken as reasonable hints that the task may no longer use SVE:
       * exec
       * fork and clone
      
      Code is added to sync data between thread.fpsimd_state and
      thread.sve_state whenever enabling/disabling SVE, in a manner
      consistent with the SVE architectural programmer's model.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Alex Bennée <alex.bennee@linaro.org>
      [will: added #include to fix allnoconfig build]
      [will: use enable_daif in do_sve_acc]
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      bc0ee476
    • D
      arm64/sve: System register and exception syndrome definitions · 67236564
      Dave Martin 提交于
      The SVE architecture adds some system registers, ID register fields
      and a dedicated ESR exception class.
      
      This patch adds the appropriate definitions that will be needed by
      the kernel.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      67236564
    • M
      arm64: ensure __dump_instr() checks addr_limit · 7a7003b1
      Mark Rutland 提交于
      It's possible for a user to deliberately trigger __dump_instr with a
      chosen kernel address.
      
      Let's avoid problems resulting from this by using get_user() rather than
      __get_user(), ensuring that we don't erroneously access kernel memory.
      
      Where we use __dump_instr() on kernel text, we already switch to
      KERNEL_DS, so this shouldn't adversely affect those cases.
      
      Fixes: 60ffc30d ("arm64: Exception handling")
      Cc: stable@vger.kernel.org
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      7a7003b1
  13. 02 11月, 2017 2 次提交
  14. 27 10月, 2017 1 次提交
  15. 25 10月, 2017 1 次提交
  16. 16 8月, 2017 3 次提交
    • M
      arm64: add VMAP_STACK overflow detection · 872d8327
      Mark Rutland 提交于
      This patch adds stack overflow detection to arm64, usable when vmap'd stacks
      are in use.
      
      Overflow is detected in a small preamble executed for each exception entry,
      which checks whether there is enough space on the current stack for the general
      purpose registers to be saved. If there is not enough space, the overflow
      handler is invoked on a per-cpu overflow stack. This approach preserves the
      original exception information in ESR_EL1 (and where appropriate, FAR_EL1).
      
      Task and IRQ stacks are aligned to double their size, enabling overflow to be
      detected with a single bit test. For example, a 16K stack is aligned to 32K,
      ensuring that bit 14 of the SP must be zero. On an overflow (or underflow),
      this bit is flipped. Thus, overflow (of less than the size of the stack) can be
      detected by testing whether this bit is set.
      
      The overflow check is performed before any attempt is made to access the
      stack, avoiding recursive faults (and the loss of exception information
      these would entail). As logical operations cannot be performed on the SP
      directly, the SP is temporarily swapped with a general purpose register
      using arithmetic operations to enable the test to be performed.
      
      This gives us a useful error message on stack overflow, as can be trigger with
      the LKDTM overflow test:
      
      [  305.388749] lkdtm: Performing direct entry OVERFLOW
      [  305.395444] Insufficient stack space to handle exception!
      [  305.395482] ESR: 0x96000047 -- DABT (current EL)
      [  305.399890] FAR: 0xffff00000a5e7f30
      [  305.401315] Task stack:     [0xffff00000a5e8000..0xffff00000a5ec000]
      [  305.403815] IRQ stack:      [0xffff000008000000..0xffff000008004000]
      [  305.407035] Overflow stack: [0xffff80003efce4e0..0xffff80003efcf4e0]
      [  305.409622] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
      [  305.412785] Hardware name: linux,dummy-virt (DT)
      [  305.415756] task: ffff80003d051c00 task.stack: ffff00000a5e8000
      [  305.419221] PC is at recursive_loop+0x10/0x48
      [  305.421637] LR is at recursive_loop+0x38/0x48
      [  305.423768] pc : [<ffff00000859f330>] lr : [<ffff00000859f358>] pstate: 40000145
      [  305.428020] sp : ffff00000a5e7f50
      [  305.430469] x29: ffff00000a5e8350 x28: ffff80003d051c00
      [  305.433191] x27: ffff000008981000 x26: ffff000008f80400
      [  305.439012] x25: ffff00000a5ebeb8 x24: ffff00000a5ebeb8
      [  305.440369] x23: ffff000008f80138 x22: 0000000000000009
      [  305.442241] x21: ffff80003ce65000 x20: ffff000008f80188
      [  305.444552] x19: 0000000000000013 x18: 0000000000000006
      [  305.446032] x17: 0000ffffa2601280 x16: ffff0000081fe0b8
      [  305.448252] x15: ffff000008ff546d x14: 000000000047a4c8
      [  305.450246] x13: ffff000008ff7872 x12: 0000000005f5e0ff
      [  305.452953] x11: ffff000008ed2548 x10: 000000000005ee8d
      [  305.454824] x9 : ffff000008545380 x8 : ffff00000a5e8770
      [  305.457105] x7 : 1313131313131313 x6 : 00000000000000e1
      [  305.459285] x5 : 0000000000000000 x4 : 0000000000000000
      [  305.461781] x3 : 0000000000000000 x2 : 0000000000000400
      [  305.465119] x1 : 0000000000000013 x0 : 0000000000000012
      [  305.467724] Kernel panic - not syncing: kernel stack overflow
      [  305.470561] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
      [  305.473325] Hardware name: linux,dummy-virt (DT)
      [  305.475070] Call trace:
      [  305.476116] [<ffff000008088ad8>] dump_backtrace+0x0/0x378
      [  305.478991] [<ffff000008088e64>] show_stack+0x14/0x20
      [  305.481237] [<ffff00000895a178>] dump_stack+0x98/0xb8
      [  305.483294] [<ffff0000080c3288>] panic+0x118/0x280
      [  305.485673] [<ffff0000080c2e9c>] nmi_panic+0x6c/0x70
      [  305.486216] [<ffff000008089710>] handle_bad_stack+0x118/0x128
      [  305.486612] Exception stack(0xffff80003efcf3a0 to 0xffff80003efcf4e0)
      [  305.487334] f3a0: 0000000000000012 0000000000000013 0000000000000400 0000000000000000
      [  305.488025] f3c0: 0000000000000000 0000000000000000 00000000000000e1 1313131313131313
      [  305.488908] f3e0: ffff00000a5e8770 ffff000008545380 000000000005ee8d ffff000008ed2548
      [  305.489403] f400: 0000000005f5e0ff ffff000008ff7872 000000000047a4c8 ffff000008ff546d
      [  305.489759] f420: ffff0000081fe0b8 0000ffffa2601280 0000000000000006 0000000000000013
      [  305.490256] f440: ffff000008f80188 ffff80003ce65000 0000000000000009 ffff000008f80138
      [  305.490683] f460: ffff00000a5ebeb8 ffff00000a5ebeb8 ffff000008f80400 ffff000008981000
      [  305.491051] f480: ffff80003d051c00 ffff00000a5e8350 ffff00000859f358 ffff00000a5e7f50
      [  305.491444] f4a0: ffff00000859f330 0000000040000145 0000000000000000 0000000000000000
      [  305.492008] f4c0: 0001000000000000 0000000000000000 ffff00000a5e8350 ffff00000859f330
      [  305.493063] [<ffff00000808205c>] __bad_stack+0x88/0x8c
      [  305.493396] [<ffff00000859f330>] recursive_loop+0x10/0x48
      [  305.493731] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.494088] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.494425] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.494649] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.494898] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.495205] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.495453] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.495708] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.496000] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.496302] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.496644] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.496894] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.497138] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.497325] [<ffff00000859f3dc>] lkdtm_OVERFLOW+0x14/0x20
      [  305.497506] [<ffff00000859f314>] lkdtm_do_action+0x1c/0x28
      [  305.497786] [<ffff00000859f178>] direct_entry+0xe0/0x170
      [  305.498095] [<ffff000008345568>] full_proxy_write+0x60/0xa8
      [  305.498387] [<ffff0000081fb7f4>] __vfs_write+0x1c/0x128
      [  305.498679] [<ffff0000081fcc68>] vfs_write+0xa0/0x1b0
      [  305.498926] [<ffff0000081fe0fc>] SyS_write+0x44/0xa0
      [  305.499182] Exception stack(0xffff00000a5ebec0 to 0xffff00000a5ec000)
      [  305.499429] bec0: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
      [  305.499674] bee0: 574f4c465245564f 0000000000000000 0000000000000000 8000000080808080
      [  305.499904] bf00: 0000000000000040 0000000000000038 fefefeff1b4bc2ff 7f7f7f7f7f7fff7f
      [  305.500189] bf20: 0101010101010101 0000000000000000 000000000047a4c8 0000000000000038
      [  305.500712] bf40: 0000000000000000 0000ffffa2601280 0000ffffc63f6068 00000000004b5000
      [  305.501241] bf60: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
      [  305.501791] bf80: 0000000000000020 0000000000000000 00000000004b5000 000000001c4cc458
      [  305.502314] bfa0: 0000000000000000 0000ffffc63f7950 000000000040a3c4 0000ffffc63f70e0
      [  305.502762] bfc0: 0000ffffa2601268 0000000080000000 0000000000000001 0000000000000040
      [  305.503207] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
      [  305.503680] [<ffff000008082fb0>] el0_svc_naked+0x24/0x28
      [  305.504720] Kernel Offset: disabled
      [  305.505189] CPU features: 0x002082
      [  305.505473] Memory Limit: none
      [  305.506181] ---[ end Kernel panic - not syncing: kernel stack overflow
      
      This patch was co-authored by Ard Biesheuvel and Mark Rutland.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Tested-by: NLaura Abbott <labbott@redhat.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      872d8327
    • M
      arm64: add on_accessible_stack() · 12964443
      Mark Rutland 提交于
      Both unwind_frame() and dump_backtrace() try to check whether a stack
      address is sane to access, with very similar logic. Both will need
      updating in order to handle overflow stacks.
      
      Factor out this logic into a helper, so that we can avoid further
      duplication when we add overflow stacks.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Tested-by: NLaura Abbott <labbott@redhat.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      12964443
    • M
      arm64: remove __die()'s stack dump · c5bc503c
      Mark Rutland 提交于
      Our __die() implementation tries to dump the stack memory, in addition
      to a backtrace, which is problematic.
      
      For contemporary 16K stacks, this can be a lot of data, which can take a
      long time to dump, and can push other useful context out of the kernel's
      printk ringbuffer (and/or a user's scrollback buffer on an attached
      console).
      
      Additionally, the code implicitly assumes that the SP is on the task's
      stack, and tries to dump everything between the SP and the highest task
      stack address. When the SP points at an IRQ stack (or is corrupted),
      this makes the kernel attempt to dump vast amounts of VA space. With
      vmap'd stacks, this may result in erroneous accesses to peripherals.
      
      This patch removes the memory dump, leaving us to rely on the backtrace,
      and other means of dumping stack memory such as kdump.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Tested-by: NLaura Abbott <labbott@redhat.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      c5bc503c
  17. 09 8月, 2017 3 次提交
    • A
      arm64: unwind: remove sp from struct stackframe · 31e43ad3
      Ard Biesheuvel 提交于
      The unwind code sets the sp member of struct stackframe to
      'frame pointer + 0x10' unconditionally, without regard for whether
      doing so produces a legal value. So let's simply remove it now that
      we have stopped using it anyway.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      31e43ad3
    • A
      arm64: unwind: reference pt_regs via embedded stack frame · 73267498
      Ard Biesheuvel 提交于
      As it turns out, the unwind code is slightly broken, and probably has
      been for a while. The problem is in the dumping of the exception stack,
      which is intended to dump the contents of the pt_regs struct at each
      level in the call stack where an exception was taken and routed to a
      routine marked as __exception (which means its stack frame is right
      below the pt_regs struct on the stack).
      
      'Right below the pt_regs struct' is ill defined, though: the unwind
      code assigns 'frame pointer + 0x10' to the .sp member of the stackframe
      struct at each level, and dump_backtrace() happily dereferences that as
      the pt_regs pointer when encountering an __exception routine. However,
      the actual size of the stack frame created by this routine (which could
      be one of many __exception routines we have in the kernel) is not known,
      and so frame.sp is pretty useless to figure out where struct pt_regs
      really is.
      
      So it seems the only way to ensure that we can find our struct pt_regs
      when walking the stack frames is to put it at a known fixed offset of
      the stack frame pointer that is passed to such __exception routines.
      The simplest way to do that is to put it inside pt_regs itself, which is
      the main change implemented by this patch. As a bonus, doing this allows
      us to get rid of a fair amount of cruft related to walking from one stack
      to the other, which is especially nice since we intend to introduce yet
      another stack for overflow handling once we add support for vmapped
      stacks. It also fixes an inconsistency where we only add a stack frame
      pointing to ELR_EL1 if we are executing from the IRQ stack but not when
      we are executing from the task stack.
      
      To consistly identify exceptions regs even in the presence of exceptions
      taken from entry code, we must check whether the next frame was created
      by entry text, rather than whether the current frame was crated by
      exception text.
      
      To avoid backtracing using PCs that fall in the idmap, or are controlled
      by userspace, we must explcitly zero the FP and LR in startup paths, and
      must ensure that the frame embedded in pt_regs is zeroed upon entry from
      EL0. To avoid these NULL entries showin in the backtrace, unwind_frame()
      is updated to avoid them.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      [Mark: compare current frame against .entry.text, avoid bogus PCs]
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      73267498
    • R
      arm64: Handle trapped DC CVAP · e1bc5d1b
      Robin Murphy 提交于
      Cache clean to PoP is subject to the same access controls as to PoC, so
      if we are trapping userspace cache maintenance with SCTLR_EL1.UCI, we
      need to be prepared to handle it. To avoid getting into complicated
      fights with binutils about ARMv8.2 options, we'll just cheat and use the
      raw SYS instruction rather than the 'proper' DC alias.
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      e1bc5d1b
  18. 08 8月, 2017 1 次提交
    • M
      arm64: unwind: avoid percpu indirection for irq stack · 09668372
      Mark Rutland 提交于
      Our IRQ_STACK_PTR() and on_irq_stack() helpers both take a cpu argument,
      used to generate a percpu address. In all cases, they are passed
      {raw_,}smp_processor_id(), so this parameter is redundant.
      
      Since {raw_,}smp_processor_id() use a percpu variable internally, this
      approach means we generate a percpu offset to find the current cpu, then
      use this to index an array of percpu offsets, which we then use to find
      the current CPU's IRQ stack pointer. Thus, most of the work is
      redundant.
      
      Instead, we can consistently use raw_cpu_ptr() to generate the CPU's
      irq_stack pointer by simply adding the percpu offset to the irq_stack
      address, which is simpler in both respects.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      09668372
  19. 07 8月, 2017 1 次提交
    • D
      arm64: syscallno is secretly an int, make it official · 35d0e6fb
      Dave Martin 提交于
      The upper 32 bits of the syscallno field in thread_struct are
      handled inconsistently, being sometimes zero extended and sometimes
      sign-extended.  In fact, only the lower 32 bits seem to have any
      real significance for the behaviour of the code: it's been OK to
      handle the upper bits inconsistently because they don't matter.
      
      Currently, the only place I can find where those bits are
      significant is in calling trace_sys_enter(), which may be
      unintentional: for example, if a compat tracer attempts to cancel a
      syscall by passing -1 to (COMPAT_)PTRACE_SET_SYSCALL at the
      syscall-enter-stop, it will be traced as syscall 4294967295
      rather than -1 as might be expected (and as occurs for a native
      tracer doing the same thing).  Elsewhere, reads of syscallno cast
      it to an int or truncate it.
      
      There's also a conspicuous amount of code and casting to bodge
      around the fact that although semantically an int, syscallno is
      stored as a u64.
      
      Let's not pretend any more.
      
      In order to preserve the stp x instruction that stores the syscall
      number in entry.S, this patch special-cases the layout of struct
      pt_regs for big endian so that the newly 32-bit syscallno field
      maps onto the low bits of the stored value.  This is not beautiful,
      but benchmarking of the getpid syscall on Juno suggests indicates a
      minor slowdown if the stp is split into an stp x and stp w.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      35d0e6fb
  20. 01 8月, 2017 1 次提交
  21. 20 7月, 2017 1 次提交
    • Q
      arm64: traps: disable irq in die() · 6f44a0ba
      Qiao Zhou 提交于
      In current die(), the irq is disabled for __die() handle, not
      including the possible panic() handling. Since the log in __die()
      can take several hundreds ms, new irq might come and interrupt
      current die().
      
      If the process calling die() holds some critical resource, and some
      other process scheduled later also needs it, then it would deadlock.
      The first panic will not be executed.
      
      So here disable irq for the whole flow of die().
      Signed-off-by: NQiao Zhou <qiaozhou@asrmicro.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      6f44a0ba
  22. 29 6月, 2017 1 次提交