1. 21 Mar 2016, 1 commit
    • arm64/kernel: fix incorrect EL0 check in inv_entry macro · b660950c
      Committed by Ard Biesheuvel
      The implementation of the inv_entry macro refers to its 'el' argument
      without the required leading backslash. This results in the undefined
      symbol 'el' being passed to the kernel_entry macro rather than the
      exception-level index as intended.
      
      This undefined symbol strangely enough does not result in build failures,
      although it is visible in vmlinux:
      
           $ nm -n vmlinux |head
                            U el
           0000000000000000 A _kernel_flags_le_hi32
           0000000000000000 A _kernel_offset_le_hi32
           0000000000000000 A _kernel_size_le_hi32
           000000000000000a A _kernel_flags_le_lo32
           .....
      
      However, it does result in incorrect code being generated for invalid
      exceptions taken from EL0, since the argument check in kernel_entry
      assumes EL1 if its argument does not equal '0'.
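      
      A minimal sketch of the bug and the fix, with the macro body
      abbreviated (only the kernel_entry line matters here):
      
           // before: the assembler sees a bare, undefined symbol 'el'
           .macro inv_entry, el, reason, regsize = 64
           kernel_entry el, \regsize
           .endm
      
           // after: '\el' expands to the macro argument (0 or 1)
           .macro inv_entry, el, reason, regsize = 64
           kernel_entry \el, \regsize
           .endm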
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  2. 06 Jan 2016, 1 commit
    • arm64: entry: remove pointless SPSR mode check · ee03353b
      Committed by Mark Rutland
      In work_pending, we may skip work if the stacked SPSR value represents
      anything other than an EL0 context. We then immediately invoke the
      kernel_exit 0 macro as part of ret_to_user, assuming a return to EL0.
      This is somewhat confusing.
      
      We use work_pending as part of the ret_to_user/ret_fast_syscall state
      machine. We only use ret_fast_syscall in the return from an SVC issued
      from EL0. We use ret_to_user for the return from EL0 exception handlers
      and also for the return from ret_from_fork when the task was not a
      kernel thread (i.e. it is a user task).
      
      Thus in all cases the stacked SPSR value must represent an EL0 context,
      and the check is redundant. This patch removes it, along with the now
      unused no_work_pending label.
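      
      A sketch of the removed check, reconstructed from the description
      above (label and offset names follow entry.S conventions but are
      quoted from memory, not the exact diff):
      
           work_pending:
               tbnz    x1, #TIF_NEED_RESCHED, work_resched
               // removed: SPSR here always describes an EL0 context
               //   ldr   x2, [sp, #S_PSTATE]
               //   tst   x2, #PSR_MODE_MASK      // user mode regs?
               //   b.ne  no_work_pending         // returning to kernel
               mov     x0, sp                     // 'regs'
               enable_irq                         // for do_notify_resume()
               bl      do_notify_resume
               b       ret_to_user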
      
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  3. 22 Dec 2015, 1 commit
    • arm64: remove irq_count and do_softirq_own_stack() · d224a69e
      Committed by James Morse
      sysrq_handle_reboot() re-enables interrupts while on the irq stack. The
      irq_stack implementation wrongly assumed this would only ever happen
      via the softirq path, allowing it to update irq_count late, in
      do_softirq_own_stack().
      
      This means that if an irq occurs in sysrq_handle_reboot(), the stack
      will be corrupted during emergency_restart(), as irq_count was not
      updated.
      
      Lose the optimisation. Instead of moving the adding/subtracting of
      irq_count into irq_stack_entry/irq_stack_exit, remove it entirely and
      compare sp_el0 (struct thread_info) with sp & ~(THREAD_SIZE - 1). This
      tells us whether we are on a task stack; if so, we can safely switch
      to the irq stack.
      Finally, remove do_softirq_own_stack(), we don't need it anymore.
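      
      The new check might look like the following sketch (x19 holds the
      original sp and tsk is sp_el0, the current struct thread_info, which
      sits at the base of the task stack; register choices are illustrative):
      
           and     x25, x19, #~(THREAD_SIZE - 1)  // round sp down to base
           cmp     x25, tsk                       // matches thread_info?
           b.ne    9998f                          // no: already on irq stack
           // yes: we are on a task stack and may switch to the irq stack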
      Reported-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: James Morse <james.morse@arm.com>
      [will: use get_thread_info macro]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  4. 16 Dec 2015, 1 commit
    • arm64: reduce stack use in irq_handler · 971c67ce
      Committed by James Morse
      The code for switching to the irq_stack stores three pieces of
      information on the stack: fp and lr, as a fake stack frame (which lets
      us walk back onto the interrupted task's stack frame), and the address
      of the struct pt_regs that contains the register values from kernel
      entry (which dump_backtrace() will print in any stack trace).
      
      To reduce this, store only fp and the pointer to the struct pt_regs.
      unwind_frame() can recognise this as the irq_stack dummy frame (it
      only ever appears at the top of the irq_stack) and use the struct
      pt_regs values to recover the missing interrupted link register.
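      
      The slimmer dummy frame could then be written as in this sketch
      (x19 holding the pt_regs pointer is an assumption of the sketch):
      
           stp     x29, x19, [sp, #-16]!  // fp + pt_regs pointer; this
           mov     x29, sp                // non-standard frame is fixed up
                                          // by unwind_frame()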
      Suggested-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  5. 10 Dec 2015, 2 commits
  6. 09 Dec 2015, 1 commit
    • arm64: irq: fix walking from irq stack to task stack · 7596abf2
      Committed by Will Deacon
      Running with CONFIG_DEBUG_SPINLOCK=y can trigger a BUG with the new IRQ
      stack code:
      
        BUG: spinlock lockup suspected on CPU#1
      
      This is due to the IRQ_STACK_TO_TASK_STACK macro incorrectly retrieving
      the task stack pointer stashed at the top of the IRQ stack.
      
      Sayeth James:
      
      | Yup, this is what is happening. It's an off-by-one due to broken
      | thinking about how the stack works. My broken thinking was:
      |
      | >   top ------------
      | >       | dummy_lr | <- irq_stack_ptr
      | >       ------------
      | >       |   x29    |
      | >       ------------
      | >       |   x19    | <- irq_stack_ptr - 0x10
      | >       ------------
      | >       |   xzr    |
      | >       ------------
      |
      | But the stack-pointer is decreased before use. So it actually looks
      | like this:
      |
      | >       ------------
      | >       |          |  <- irq_stack_ptr
      | >   top ------------
      | >       | dummy_lr |
      | >       ------------
      | >       |   x29    | <- irq_stack_ptr - 0x10
      | >       ------------
      | >       |   x19    |
      | >       ------------
      | >       |   xzr    | <- irq_stack_ptr - 0x20
      | >       ------------
      |
      | The value being used as the original stack is x29, which in all the
      | tests is sp but without the current frame's data, hence there are no
      | missing frames in the output.
      |
      | Jungseok Lee picked it up with a 32bit user space because aarch32
      | can't use x29, so it remains 0 forever. The fix he posted is correct.
      
      This patch fixes the macro and adds some of this wisdom to a comment,
      so that the layout of the IRQ stack is well understood.
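      
      Following the corrected diagram, the saved task stack pointer (x19)
      sits one slot below x29, i.e. at irq_stack_ptr - 0x18 rather than
      - 0x10. A sketch of the fixed macro (the offset here is derived from
      the diagram; the exact definition in the tree may differ in form):
      
           /* walk from the irq stack back to the task stack by reading
            * the x19 slot, which holds the interrupted task's sp */
           #define IRQ_STACK_TO_TASK_STACK(ptr) \
                   (*((unsigned long *)((ptr) - 0x18)))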
      
      Cc: James Morse <james.morse@arm.com>
      Reported-by: Jungseok Lee <jungseoklee85@gmail.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  7. 08 Dec 2015, 2 commits
  8. 05 Dec 2015, 1 commit
    • arm64: Add trace_hardirqs_off annotation in ret_to_user · db3899a6
      Committed by Catalin Marinas
      When a kernel is built with CONFIG_TRACE_IRQFLAGS, the following
      warning is produced when entering userspace for the first time:
      
        WARNING: at /work/Linux/linux-2.6-aarch64/kernel/locking/lockdep.c:3519
        Modules linked in:
        CPU: 1 PID: 1 Comm: systemd Not tainted 4.4.0-rc3+ #639
        Hardware name: Juno (DT)
        task: ffffffc9768a0000 ti: ffffffc9768a8000 task.ti: ffffffc9768a8000
        PC is at check_flags.part.22+0x19c/0x1a8
        LR is at check_flags.part.22+0x19c/0x1a8
        pc : [<ffffffc0000fba6c>] lr : [<ffffffc0000fba6c>] pstate: 600001c5
        sp : ffffffc9768abe10
        x29: ffffffc9768abe10 x28: ffffffc9768a8000
        x27: 0000000000000000 x26: 0000000000000001
        x25: 00000000000000a6 x24: ffffffc00064be6c
        x23: ffffffc0009f249e x22: ffffffc9768a0000
        x21: ffffffc97fea5480 x20: 00000000000001c0
        x19: ffffffc00169a000 x18: 0000005558cc7b58
        x17: 0000007fb78e3180 x16: 0000005558d2e238
        x15: ffffffffffffffff x14: 0ffffffffffffffd
        x13: 0000000000000008 x12: 0101010101010101
        x11: 7f7f7f7f7f7f7f7f x10: fefefefefefeff63
        x9 : 7f7f7f7f7f7f7f7f x8 : 6e655f7371726964
        x7 : 0000000000000001 x6 : ffffffc0001079c4
        x5 : 0000000000000000 x4 : 0000000000000001
        x3 : ffffffc001698438 x2 : 0000000000000000
        x1 : ffffffc9768a0000 x0 : 000000000000002e
        Call trace:
        [<ffffffc0000fba6c>] check_flags.part.22+0x19c/0x1a8
        [<ffffffc0000fc440>] lock_is_held+0x80/0x98
        [<ffffffc00064bafc>] __schedule+0x404/0x730
        [<ffffffc00064be6c>] schedule+0x44/0xb8
        [<ffffffc000085bb0>] ret_to_user+0x0/0x24
        possible reason: unannotated irqs-off.
        irq event stamp: 502169
        hardirqs last  enabled at (502169): [<ffffffc000085a98>] el0_irq_naked+0x1c/0x24
        hardirqs last disabled at (502167): [<ffffffc0000bb3bc>] __do_softirq+0x17c/0x298
        softirqs last  enabled at (502168): [<ffffffc0000bb43c>] __do_softirq+0x1fc/0x298
        softirqs last disabled at (502143): [<ffffffc0000bb830>] irq_exit+0xa0/0xf0
      
      This happens because we disable interrupts in ret_to_user before calling
      schedule() in work_resched. This patch adds the necessary
      trace_hardirqs_off annotation.
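      
      The annotation plausibly looks like this sketch (placement
      reconstructed from the description above):
      
           work_resched:
           #ifdef CONFIG_TRACE_IRQFLAGS
               bl      trace_hardirqs_off  // IRQs are off here; tell the
                                           // tracer before schedule() runs
           #endif
               bl      schedule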
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Reported-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  9. 16 Oct 2015, 1 commit
  10. 21 Aug 2015, 1 commit
    • arm64: entry: always restore x0 from the stack on syscall return · 412fcb6c
      Committed by Will Deacon
      We have a micro-optimisation on the fast syscall return path where we
      take care to keep x0 live with the return value from the syscall so that
      we can avoid restoring it from the stack. The benefit of doing this is
      fairly suspect, since we will be restoring x1 from the stack anyway
      (which lives adjacent in the pt_regs structure) and the only additional
      cost is saving x0 back to pt_regs after the syscall handler, which could
      be seen as a poor man's prefetch.
      
      More importantly, this causes issues with the context tracking code.
      
      The ct_user_enter macro ends up branching into C code, which is free to
      use x0 as a scratch register and consequently leads to us returning junk
      back to userspace as the syscall return value. Rather than special case
      the context-tracking code, this patch removes the questionable
      optimisation entirely.
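      
      The change amounts to writing x0 back into pt_regs as soon as the
      syscall handler returns, so that kernel_exit can always restore x0
      and x1 together from the stack; a sketch:
      
           ret_fast_syscall:
               disable_irq              // disable interrupts
               str     x0, [sp, #S_X0]  // save returned x0 into pt_regs
               ...
               kernel_exit 0            // restores x0/x1 from the stack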
      
      Cc: <stable@vger.kernel.org>
      Cc: Larry Bassel <larry.bassel@linaro.org>
      Cc: Kevin Hilman <khilman@linaro.org>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Reported-by: Hanjun Guo <hanjun.guo@linaro.org>
      Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  11. 27 Jul 2015, 1 commit
  12. 22 Jul 2015, 1 commit
  13. 09 Jul 2015, 1 commit
  14. 17 Jun 2015, 1 commit
    • arm64: entry: fix context tracking for el0_sp_pc · 46b0567c
      Committed by Mark Rutland
      Commit 6c81fe79 ("arm64: enable context tracking") did not
      update el0_sp_pc to use ct_user_exit, but this appears to have been
      unintentional. In commit 6ab6463a ("arm64: adjust el0_sync so
      that a function can be called") we made x0 available, and in the return
      to userspace we call ct_user_enter in the kernel_exit macro.
      
      Due to this, we currently don't correctly inform RCU of the user->kernel
      transition, and may erroneously account for time spent in the kernel as
      if we were in an extended quiescent state when CONFIG_CONTEXT_TRACKING
      is enabled.
      
      As we do record the kernel->user transition, a userspace application
      making accesses from an unaligned stack pointer can demonstrate the
      imbalance, provoking the following warning:
      
      ------------[ cut here ]------------
      WARNING: CPU: 2 PID: 3660 at kernel/context_tracking.c:75 context_tracking_enter+0xd8/0xe4()
      Modules linked in:
      CPU: 2 PID: 3660 Comm: a.out Not tainted 4.1.0-rc7+ #8
      Hardware name: ARM Juno development board (r0) (DT)
      Call trace:
      [<ffffffc000089914>] dump_backtrace+0x0/0x124
      [<ffffffc000089a48>] show_stack+0x10/0x1c
      [<ffffffc0005b3cbc>] dump_stack+0x84/0xc8
      [<ffffffc0000b3214>] warn_slowpath_common+0x98/0xd0
      [<ffffffc0000b330c>] warn_slowpath_null+0x14/0x20
      [<ffffffc00013ada4>] context_tracking_enter+0xd4/0xe4
      [<ffffffc0005b534c>] preempt_schedule_irq+0xd4/0x114
      [<ffffffc00008561c>] el1_preempt+0x4/0x28
      [<ffffffc0001b8040>] exit_files+0x38/0x4c
      [<ffffffc0000b5b94>] do_exit+0x430/0x978
      [<ffffffc0000b614c>] do_group_exit+0x40/0xd4
      [<ffffffc0000c0208>] get_signal+0x23c/0x4f4
      [<ffffffc0000890b4>] do_signal+0x1ac/0x518
      [<ffffffc000089650>] do_notify_resume+0x5c/0x68
      ---[ end trace 963c192600337066 ]---
      
      This patch adds the missing ct_user_exit to the el0_sp_pc entry path,
      correcting the context tracking for this case.
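      
      The fix is a one-line addition to the entry path; sketched with the
      surrounding lines abbreviated:
      
           el0_sp_pc:
               /*
                * Stack or PC alignment exception handling
                */
               mrs     x26, far_el1
               ct_user_exit             // added: report the user->kernel
                                        // transition to context tracking
               mov     x0, x26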
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Fixes: 6c81fe79 ("arm64: enable context tracking")
      Cc: <stable@vger.kernel.org> # v3.17+
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  15. 09 Jun 2015, 1 commit
    • arm64: fix missing syscall trace exit · 04d7e098
      Committed by Josh Stone
      If a syscall is entered without TIF_SYSCALL_TRACE set, then it goes on
      the fast path.  It's then possible to have TIF_SYSCALL_TRACE added in
      the middle of the syscall, but ret_fast_syscall doesn't check this flag
      again.  This causes a ptrace syscall-exit-stop to be missed.
      
      For instance, from a PTRACE_EVENT_FORK reported during do_fork, the
      tracer might resume with PTRACE_SYSCALL, setting TIF_SYSCALL_TRACE.
      Now the completion of the fork should have a syscall-exit-stop.
      
      Russell King fixed this on arm by re-checking _TIF_SYSCALL_WORK in the
      fast exit path.  Do the same on arm64.
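      
      A sketch of the re-check on the fast exit path (the trace label name
      is illustrative):
      
           ret_fast_syscall:
               disable_irq                         // disable interrupts
               str     x0, [sp, #S_X0]             // returned x0
               ldr     x1, [tsk, #TI_FLAGS]        // re-read work flags
               and     x2, x1, #_TIF_SYSCALL_WORK  // tracing enabled
               cbnz    x2, fast_work_trace         //   mid-syscall? go via
               ...                                 //   the trace exit path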
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Josh Stone <jistone@redhat.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  16. 05 Jun 2015, 2 commits
  17. 01 Apr 2015, 1 commit
  18. 27 Jan 2015, 1 commit
  19. 15 Jan 2015, 1 commit
  20. 28 Nov 2014, 1 commit
    • arm64: ptrace: allow tracer to skip a system call · 1014c81d
      Committed by AKASHI Takahiro
      If a tracer changes a syscall number to -1, the traced system call
      should be skipped, with a return value specified in x0. This patch
      implements these semantics (sketched after the note below).
      
      Please note:
      * syscall entry tracing and syscall exit tracing (ftrace tracepoint and
        audit) are always executed, if enabled, even when a system call is
        skipped (that is, its number is -1).
        This avoids a potential bug where audit_syscall_entry() could be
        called without a matching audit_syscall_exit() for the previous
        system call, which would cause an oops in audit_syscall_entry().
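      
      A simplified sketch of the skip logic after syscall_trace_enter()
      (label names are illustrative):
      
           bl      syscall_trace_enter
           cmp     w0, #-1                 // tracer asked us to skip?
           b.ne    do_the_syscall          // no: dispatch as usual
           // yes: skip the handler itself, but still take the exit-tracing
           // path, returning whatever value the tracer placed in x0
           b       syscall_trace_exit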
      Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
      [will: fixed up conflict with blr rework]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  21. 25 Nov 2014, 1 commit
  22. 14 Nov 2014, 2 commits
    • arm64: entry: use ldp/stp instead of push/pop when saving/restoring regs · 63648dd2
      Committed by Will Deacon
      The push/pop instructions can be suboptimal when saving/restoring large
      amounts of data to/from the stack, for example on entry/exit from the
      kernel. This is because:
      
        (1) They act on descending addresses (i.e. the newly decremented sp),
            which may defeat some hardware prefetchers
      
        (2) They introduce an implicit dependency between each instruction, as
            the sp has to be updated in order to resolve the address of the
            next access.
      
      This patch removes the push/pop instructions from our kernel entry/exit
      macros in favour of ldp/stp plus offset.
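      
      Schematically (register pairs and offsets are illustrative):
      
           // before: each push must wait for the previous sp update
           push    x10, x11         // macro for stp x10, x11, [sp, #-16]!
           push    x8, x9
      
           // after: one sp adjustment, then independent stores
           sub     sp, sp, #S_FRAME_SIZE
           stp     x8, x9, [sp, #16 * 4]
           stp     x10, x11, [sp, #16 * 5]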
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: entry: avoid writing lr explicitly for constructing return paths · d54e81f9
      Committed by Will Deacon
      Using an explicit adr instruction to set the link register to point at
      ret_fast_syscall/ret_to_user can defeat branch and return stack predictors.
      
      Instead, use the standard calling instructions (bl, blr) and have an
      unconditional branch as the following instruction.
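      
      A sketch of the two forms:
      
           // before: a manually constructed return address defeats the
           // return-stack predictor
           adr     lr, ret_fast_syscall
           br      x16                 // call sys_* routine
      
           // after: a standard call, followed by an unconditional branch
           blr     x16                 // call sys_* routine
           b       ret_fast_syscall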
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  23. 23 Sep 2014, 1 commit
  24. 10 Jul 2014, 3 commits
  25. 18 Jun 2014, 1 commit
  26. 12 May 2014, 3 commits
  27. 08 May 2014, 1 commit
    • arm64: defer reloading a task's FPSIMD state to userland resume · 005f78cd
      Committed by Ard Biesheuvel
      If a task gets scheduled out and back in again and nothing has touched
      its FPSIMD state in the mean time, there is really no reason to reload
      it from memory. Similarly, repeated calls to kernel_neon_begin() and
      kernel_neon_end() will preserve and restore the FPSIMD state every time.
      
      This patch defers the FPSIMD state restore to the last possible moment,
      i.e., right before the task returns to userland. If a task does not return to
      userland at all (for any reason), the existing FPSIMD state is preserved
      and may be reused by the owning task if it gets scheduled in again on the
      same CPU.
      
      This patch adds two more functions to abstract away from straight FPSIMD
      register file saves and restores:
      - fpsimd_restore_current_state -> ensure current's FPSIMD state is loaded
      - fpsimd_flush_task_state -> invalidate live copies of a task's FPSIMD state
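      
      One plausible shape for the ownership tracking this relies on (a
      sketch, not necessarily the exact mainline code; the cpu field and
      the fpsimd_last_state variable are assumptions of the sketch):
      
           /* each CPU remembers whose FPSIMD state its registers hold;
            * each task's state remembers the CPU that last held it */
           static DEFINE_PER_CPU(struct fpsimd_state *, fpsimd_last_state);
      
           void fpsimd_flush_task_state(struct task_struct *t)
           {
                   /* poison the tag so no CPU believes it still holds a
                    * live copy of this task's FPSIMD registers */
                   t->thread.fpsimd_state.cpu = NR_CPUS;
           }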
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
  28. 13 Jan 2014, 1 commit
  29. 20 Dec 2013, 1 commit
  30. 26 Nov 2013, 1 commit
    • arm64: let the core code deal with preempt_count · 64681787
      Committed by Marc Zyngier
      Commit f27dde8d (sched: Add NEED_RESCHED to the preempt_count)
      introduced the use of bit 31 in preempt_count for obscure scheduling
      purposes.
      
      This causes interrupts taken from EL0 to hit the (open-coded) BUG when
      this flag is flipped while handling the interrupt (we compare the
      values before and after, and kill the kernel if they differ).
      
      The fix is to stop messing with the preempt count entirely, as this
      is already being dealt with in the generic code (irq_enter/irq_exit).
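      
      The removed check had roughly this shape (reconstructed from the
      description; register choices are illustrative):
      
           ldr     w0, [tsk, #TI_PREEMPT]  // re-read the preempt count
           cmp     w0, w24                 // unchanged since entry?
           b.eq    1f
           mov     x1, #0                  // open-coded BUG: a deliberate
           str     x1, [x1]                //   NULL write kills the kernel
       1: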
      
      Tested on a dual A53 FPGA running cyclictest.
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  31. 05 Nov 2013, 1 commit
  32. 03 Sep 2013, 1 commit