1. 12 11月, 2016 7 次提交
    • M
      arm64: smp: prepare for smp_processor_id() rework · 580efaa7
      Mark Rutland 提交于
      Subsequent patches will make smp_processor_id() use a percpu variable.
      This will make smp_processor_id() dependent on the percpu offset, and
      thus we cannot use smp_processor_id() to figure out what to initialise
      the offset to.
      
      Prepare for this by initialising the percpu offset based on
      current::cpu, which will work regardless of how smp_processor_id() is
      implemented. Also, make this relationship obvious by placing this code
      together at the start of secondary_start_kernel().
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Tested-by: NLaura Abbott <labbott@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      580efaa7
    • M
      arm64: move sp_el0 and tpidr_el1 into cpu_suspend_ctx · 623b476f
      Mark Rutland 提交于
      When returning from idle, we rely on the fact that thread_info lives at
      the end of the kernel stack, and restore this by masking the saved stack
      pointer. Subsequent patches will sever the relationship between the
      stack and thread_info, and to cater for this we must save/restore sp_el0
      explicitly, storing it in cpu_suspend_ctx.
      
      As cpu_suspend_ctx must be doubleword aligned, this leaves us with an
      extra slot in cpu_suspend_ctx. We can use this to save/restore tpidr_el1
      in the same way, which simplifies the code, avoiding pointer chasing on
      the restore path (as we no longer need to load thread_info::cpu followed
      by the relevant slot in __per_cpu_offset based on this).
      
      This patch stashes both registers in cpu_suspend_ctx.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Tested-by: NLaura Abbott <labbott@redhat.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      623b476f
    • M
      arm64: prep stack walkers for THREAD_INFO_IN_TASK · 9bbd4c56
      Mark Rutland 提交于
      When CONFIG_THREAD_INFO_IN_TASK is selected, task stacks may be freed
      before a task is destroyed. To account for this, the stacks are
      refcounted, and when manipulating the stack of another task, it is
      necessary to get/put the stack to ensure it isn't freed and/or re-used
      while we do so.
      
      This patch reworks the arm64 stack walking code to account for this.
      When CONFIG_THREAD_INFO_IN_TASK is not selected these perform no
      refcounting, and this should only be a structural change that does not
      affect behaviour.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Tested-by: NLaura Abbott <labbott@redhat.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      9bbd4c56
    • M
      arm64: unexport walk_stackframe · 2020a5ae
      Mark Rutland 提交于
      The walk_stackframe functions is architecture-specific, with a varying
      prototype, and common code should not use it directly. None of its
      current users can be built as modules. With THREAD_INFO_IN_TASK, users
      will also need to hold a stack reference before calling it.
      
      There's no reason for it to be exported, and it's very easy to misuse,
      so unexport it for now.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      2020a5ae
    • M
      arm64: traps: simplify die() and __die() · 876e7a38
      Mark Rutland 提交于
      In arm64's die and __die routines we pass around a thread_info, and
      subsequently use this to determine the relevant task_struct, and the end
      of the thread's stack. Subsequent patches will decouple thread_info from
      the stack, and this approach will no longer work.
      
      To figure out the end of the stack, we can use the new generic
      end_of_stack() helper. As we only call __die() from die(), and die()
      always deals with the current task, we can remove the parameter and have
      both acquire current directly, which also makes it clear that __die
      can't be called for arbitrary tasks.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Tested-by: NLaura Abbott <labbott@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      876e7a38
    • M
      arm64: factor out current_stack_pointer · a9ea0017
      Mark Rutland 提交于
      We define current_stack_pointer in <asm/thread_info.h>, though other
      files and header relying upon it do not have this necessary include, and
      are thus fragile to changes in the header soup.
      
      Subsequent patches will affect the header soup such that directly
      including <asm/thread_info.h> may result in a circular header include in
      some of these cases, so we can't simply include <asm/thread_info.h>.
      
      Instead, factor current_thread_info into its own header, and have all
      existing users include this explicitly.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Tested-by: NLaura Abbott <labbott@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      a9ea0017
    • M
      arm64: asm-offsets: remove unused definitions · 3fe12da4
      Mark Rutland 提交于
      Subsequent patches will move the thread_info::{task,cpu} fields, and the
      current TI_{TASK,CPU} offset definitions are not used anywhere.
      
      This patch removes the redundant definitions.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Tested-by: NLaura Abbott <labbott@redhat.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      3fe12da4
  2. 08 11月, 2016 10 次提交
  3. 20 10月, 2016 4 次提交
    • M
      arm64: fix show_regs fallout from KERN_CONT changes · db4b0710
      Mark Rutland 提交于
      Recently in commit 4bcc595c ("printk: reinstate KERN_CONT for
      printing continuation lines"), the behaviour of printk changed w.r.t.
      KERN_CONT. Now, KERN_CONT is mandatory to continue existing lines.
      Without this, prefixes are inserted, making output illegible, e.g.
      
      [ 1007.069010] pc : [<ffff00000871898c>] lr : [<ffff000008718948>] pstate: 40000145
      [ 1007.076329] sp : ffff000008d53ec0
      [ 1007.079606] x29: ffff000008d53ec0 [ 1007.082797] x28: 0000000080c50018
      [ 1007.086160]
      [ 1007.087630] x27: ffff000008e0c7f8 [ 1007.090820] x26: ffff80097631ca00
      [ 1007.094183]
      [ 1007.095653] x25: 0000000000000001 [ 1007.098843] x24: 000000ea68b61cac
      [ 1007.102206]
      
      ... or when dumped with the userpace dmesg tool, which has slightly
      different implicit newline behaviour. e.g.
      
      [ 1007.069010] pc : [<ffff00000871898c>] lr : [<ffff000008718948>] pstate: 40000145
      [ 1007.076329] sp : ffff000008d53ec0
      [ 1007.079606] x29: ffff000008d53ec0
      [ 1007.082797] x28: 0000000080c50018
      [ 1007.086160]
      [ 1007.087630] x27: ffff000008e0c7f8
      [ 1007.090820] x26: ffff80097631ca00
      [ 1007.094183]
      [ 1007.095653] x25: 0000000000000001
      [ 1007.098843] x24: 000000ea68b61cac
      [ 1007.102206]
      
      We can't simply always use KERN_CONT for lines which may or may not be
      continuations. That causes line prefixes (e.g. timestamps) to be
      supressed, and the alignment of all but the first line will be broken.
      
      For even more fun, we can't simply insert some dummy empty-string printk
      calls, as GCC warns for an empty printk string, and even if we pass
      KERN_DEFAULT explcitly to silence the warning, the prefix gets swallowed
      unless there is an additional part to the string.
      
      Instead, we must manually iterate over pairs of registers, which gives
      us the legible output we want in either case, e.g.
      
      [  169.771790] pc : [<ffff00000871898c>] lr : [<ffff000008718948>] pstate: 40000145
      [  169.779109] sp : ffff000008d53ec0
      [  169.782386] x29: ffff000008d53ec0 x28: 0000000080c50018
      [  169.787650] x27: ffff000008e0c7f8 x26: ffff80097631de00
      [  169.792913] x25: 0000000000000001 x24: 00000027827b2cf4
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      db4b0710
    • J
      arm64: suspend: Reconfigure PSTATE after resume from idle · d0854412
      James Morse 提交于
      The suspend/resume path in kernel/sleep.S, as used by cpu-idle, does not
      save/restore PSTATE. As a result of this cpufeatures that were detected
      and have bits in PSTATE get lost when we resume from idle.
      
      UAO gets set appropriately on the next context switch. PAN will be
      re-enabled next time we return from user-space, but on a preemptible
      kernel we may run work accessing user space before this point.
      
      Add code to re-enable theses two features in __cpu_suspend_exit().
      We re-use uao_thread_switch() passing current.
      Signed-off-by: NJames Morse <james.morse@arm.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      d0854412
    • J
      arm64: cpufeature: Schedule enable() calls instead of calling them via IPI · 2a6dcb2b
      James Morse 提交于
      The enable() call for a cpufeature/errata is called using on_each_cpu().
      This issues a cross-call IPI to get the work done. Implicitly, this
      stashes the running PSTATE in SPSR when the CPU receives the IPI, and
      restores it when we return. This means an enable() call can never modify
      PSTATE.
      
      To allow PAN to do this, change the on_each_cpu() call to use
      stop_machine(). This schedules the work on each CPU which allows
      us to modify PSTATE.
      
      This involves changing the protype of all the enable() functions.
      
      enable_cpu_capabilities() is called during boot and enables the feature
      on all online CPUs. This path now uses stop_machine(). CPU features for
      hotplug'd CPUs are enabled by verify_local_cpu_features() which only
      acts on the local CPU, and can already modify the running PSTATE as it
      is called from secondary_start_kernel().
      Reported-by: NTony Thompson <anthony.thompson@arm.com>
      Reported-by: NVladimir Murzin <vladimir.murzin@arm.com>
      Signed-off-by: NJames Morse <james.morse@arm.com>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      2a6dcb2b
    • A
      arm64: Cortex-A53 errata workaround: check for kernel addresses · 87261d19
      Andre Przywara 提交于
      Commit 7dd01aef ("arm64: trap userspace "dc cvau" cache operation on
      errata-affected core") adds code to execute cache maintenance instructions
      in the kernel on behalf of userland on CPUs with certain ARM CPU errata.
      It turns out that the address hasn't been checked to be a valid user
      space address, allowing userland to clean cache lines in kernel space.
      Fix this by introducing an address check before executing the
      instructions on behalf of userland.
      
      Since the address doesn't come via a syscall parameter, we can't just
      reject tagged pointers and instead have to remove the tag when checking
      against the user address limit.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 7dd01aef ("arm64: trap userspace "dc cvau" cache operation on errata-affected core")
      Reported-by: NKristina Martsenko <kristina.martsenko@arm.com>
      Signed-off-by: NAndre Przywara <andre.przywara@arm.com>
      [will: rework commit message + replace access_ok with max_user_addr()]
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      87261d19
  4. 19 10月, 2016 1 次提交
    • W
      arm64: swp emulation: bound LL/SC retries before rescheduling · 1c5b51df
      Will Deacon 提交于
      If a CPU does not implement a global monitor for certain memory types,
      then userspace can attempt a kernel DoS by issuing SWP instructions
      targetting the problematic memory (for example, a framebuffer mapped
      with non-cacheable attributes).
      
      The SWP emulation code protects against these sorts of attacks by
      checking for pending signals and potentially rescheduling when the STXR
      instruction fails during the emulation. Whilst this is good for avoiding
      livelock, it harms emulation of legitimate SWP instructions on CPUs
      where forward progress is not guaranteed if there are memory accesses to
      the same reservation granule (up to 2k) between the failing STXR and
      the retry of the LDXR.
      
      This patch solves the problem by retrying the STXR a bounded number of
      times (4) before breaking out of the LL/SC loop and looking for
      something else to do.
      
      Cc: <stable@vger.kernel.org>
      Fixes: bd35a4ad ("arm64: Port SWP/SWPB emulation support from arm")
      Reviewed-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      1c5b51df
  5. 17 10月, 2016 2 次提交
    • M
      arm64: kernel: Init MDCR_EL2 even in the absence of a PMU · 85054035
      Marc Zyngier 提交于
      Commit f436b2ac ("arm64: kernel: fix architected PMU registers
      unconditional access") made sure we wouldn't access unimplemented
      PMU registers, but also left MDCR_EL2 uninitialized in that case,
      leading to trap bits being potentially left set.
      
      Make sure we always write something in that register.
      
      Fixes: f436b2ac ("arm64: kernel: fix architected PMU registers unconditional access")
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      85054035
    • L
      arm64: kernel: numa: fix ACPI boot cpu numa node mapping · baa5567c
      Lorenzo Pieralisi 提交于
      Commit 7ba5f605 ("arm64/numa: remove the limitation that cpu0 must
      bind to node0") removed the numa cpu<->node mapping restriction whereby
      logical cpu 0 always corresponds to numa node 0; removing the
      restriction was correct, in that it does not really exist in practice
      but the commit only updated the early mapping of logical cpu 0 to its
      real numa node for the DT boot path, missing the ACPI one, leading to
      boot failures on ACPI systems owing to missing node<->cpu map for
      logical cpu 0.
      
      Fix the issue by updating the ACPI boot path with code that carries out
      the early cpu<->node mapping also for the boot cpu (ie cpu 0), mirroring
      what is currently done in the DT boot path.
      
      Fixes: 7ba5f605 ("arm64/numa: remove the limitation that cpu0 must bind to node0")
      Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Tested-by: NLaszlo Ersek <lersek@redhat.com>
      Reported-by: NLaszlo Ersek <lersek@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Laszlo Ersek <lersek@redhat.com>
      Cc: Hanjun Guo <hanjun.guo@linaro.org>
      Cc: Andrew Jones <drjones@redhat.com>
      Cc: Zhen Lei <thunder.leizhen@huawei.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      baa5567c
  6. 16 10月, 2016 1 次提交
    • D
      kprobes: Unpoison stack in jprobe_return() for KASAN · 9f7d416c
      Dmitry Vyukov 提交于
      I observed false KSAN positives in the sctp code, when
      sctp uses jprobe_return() in jsctp_sf_eat_sack().
      
      The stray 0xf4 in shadow memory are stack redzones:
      
      [     ] ==================================================================
      [     ] BUG: KASAN: stack-out-of-bounds in memcmp+0xe9/0x150 at addr ffff88005e48f480
      [     ] Read of size 1 by task syz-executor/18535
      [     ] page:ffffea00017923c0 count:0 mapcount:0 mapping:          (null) index:0x0
      [     ] flags: 0x1fffc0000000000()
      [     ] page dumped because: kasan: bad access detected
      [     ] CPU: 1 PID: 18535 Comm: syz-executor Not tainted 4.8.0+ #28
      [     ] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      [     ]  ffff88005e48f2d0 ffffffff82d2b849 ffffffff0bc91e90 fffffbfff10971e8
      [     ]  ffffed000bc91e90 ffffed000bc91e90 0000000000000001 0000000000000000
      [     ]  ffff88005e48f480 ffff88005e48f350 ffffffff817d3169 ffff88005e48f370
      [     ] Call Trace:
      [     ]  [<ffffffff82d2b849>] dump_stack+0x12e/0x185
      [     ]  [<ffffffff817d3169>] kasan_report+0x489/0x4b0
      [     ]  [<ffffffff817d31a9>] __asan_report_load1_noabort+0x19/0x20
      [     ]  [<ffffffff82d49529>] memcmp+0xe9/0x150
      [     ]  [<ffffffff82df7486>] depot_save_stack+0x176/0x5c0
      [     ]  [<ffffffff817d2031>] save_stack+0xb1/0xd0
      [     ]  [<ffffffff817d27f2>] kasan_slab_free+0x72/0xc0
      [     ]  [<ffffffff817d05b8>] kfree+0xc8/0x2a0
      [     ]  [<ffffffff85b03f19>] skb_free_head+0x79/0xb0
      [     ]  [<ffffffff85b0900a>] skb_release_data+0x37a/0x420
      [     ]  [<ffffffff85b090ff>] skb_release_all+0x4f/0x60
      [     ]  [<ffffffff85b11348>] consume_skb+0x138/0x370
      [     ]  [<ffffffff8676ad7b>] sctp_chunk_put+0xcb/0x180
      [     ]  [<ffffffff8676ae88>] sctp_chunk_free+0x58/0x70
      [     ]  [<ffffffff8677fa5f>] sctp_inq_pop+0x68f/0xef0
      [     ]  [<ffffffff8675ee36>] sctp_assoc_bh_rcv+0xd6/0x4b0
      [     ]  [<ffffffff8677f2c1>] sctp_inq_push+0x131/0x190
      [     ]  [<ffffffff867bad69>] sctp_backlog_rcv+0xe9/0xa20
      [ ... ]
      [     ] Memory state around the buggy address:
      [     ]  ffff88005e48f380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [     ]  ffff88005e48f400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [     ] >ffff88005e48f480: f4 f4 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [     ]                    ^
      [     ]  ffff88005e48f500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [     ]  ffff88005e48f580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [     ] ==================================================================
      
      KASAN stack instrumentation poisons stack redzones on function entry
      and unpoisons them on function exit. If a function exits abnormally
      (e.g. with a longjmp like jprobe_return()), stack redzones are left
      poisoned. Later this leads to random KASAN false reports.
      
      Unpoison stack redzones in the frames we are going to jump over
      before doing actual longjmp in jprobe_return().
      Signed-off-by: NDmitry Vyukov <dvyukov@google.com>
      Acked-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: kasan-dev@googlegroups.com
      Cc: surovegin@google.com
      Cc: rostedt@goodmis.org
      Link: http://lkml.kernel.org/r/1476454043-101898-1-git-send-email-dvyukov@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      9f7d416c
  7. 12 10月, 2016 1 次提交
  8. 08 10月, 2016 1 次提交
  9. 28 9月, 2016 1 次提交
  10. 26 9月, 2016 1 次提交
    • M
      arm64: fix dump_backtrace/unwind_frame with NULL tsk · b5e7307d
      Mark Rutland 提交于
      In some places, dump_backtrace() is called with a NULL tsk parameter,
      e.g. in bug_handler() in arch/arm64, or indirectly via show_stack() in
      core code. The expectation is that this is treated as if current were
      passed instead of NULL. Similar is true of unwind_frame().
      
      Commit a80a0eb7 ("arm64: make irq_stack_ptr more robust") didn't
      take this into account. In dump_backtrace() it compares tsk against
      current *before* we check if tsk is NULL, and in unwind_frame() we never
      set tsk if it is NULL.
      
      Due to this, we won't initialise irq_stack_ptr in either function. In
      dump_backtrace() this results in calling dump_mem() for memory
      immediately above the IRQ stack range, rather than for the relevant
      range on the task stack. In unwind_frame we'll reject unwinding frames
      on the IRQ stack.
      
      In either case this results in incomplete or misleading backtrace
      information, but is not otherwise problematic. The initial percpu areas
      (including the IRQ stacks) are allocated in the linear map, and dump_mem
      uses __get_user(), so we shouldn't access anything with side-effects,
      and will handle holes safely.
      
      This patch fixes the issue by having both functions handle the NULL tsk
      case before doing anything else with tsk.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Fixes: a80a0eb7 ("arm64: make irq_stack_ptr more robust")
      Acked-by: NJames Morse <james.morse@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yang Shi <yang.shi@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      b5e7307d
  11. 24 9月, 2016 1 次提交
  12. 23 9月, 2016 2 次提交
    • A
      arm64: kgdb: handle read-only text / modules · 67787b68
      AKASHI Takahiro 提交于
      Handle read-only cases when CONFIG_DEBUG_RODATA (4.0) or
      CONFIG_DEBUG_SET_MODULE_RONX (3.18) are enabled by using
      aarch64_insn_write() instead of probe_kernel_write() as introduced by
      commit 2f896d58 ("arm64: use fixmap for text patching") in 4.0.
      
      Fixes: 11d91a77 ("arm64: Add CONFIG_DEBUG_SET_MODULE_RONX support")
      Signed-off-by: NAKASHI Takahiro <takahiro.akashi@linaro.org>
      Reviewed-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      67787b68
    • D
      arm64: Call numa_store_cpu_info() earlier. · c18df0ad
      David Daney 提交于
      The wq_numa_init() function makes a private CPU to node map by calling
      cpu_to_node() early in the boot process, before the non-boot CPUs are
      brought online.  Since the default implementation of cpu_to_node()
      returns zero for CPUs that have never been brought online, the
      workqueue system's view is that *all* CPUs are on node zero.
      
      When the unbound workqueue for a non-zero node is created, the
      tsk_cpus_allowed() for the worker threads is the empty set because
      there are, in the view of the workqueue system, no CPUs on non-zero
      nodes.  The code in try_to_wake_up() using this empty cpumask ends up
      using the cpumask empty set value of NR_CPUS as an index into the
      per-CPU area pointer array, and gets garbage as it is one past the end
      of the array.  This results in:
      
      [    0.881970] Unable to handle kernel paging request at virtual address fffffb1008b926a4
      [    1.970095] pgd = fffffc00094b0000
      [    1.973530] [fffffb1008b926a4] *pgd=0000000000000000, *pud=0000000000000000, *pmd=0000000000000000
      [    1.982610] Internal error: Oops: 96000004 [#1] SMP
      [    1.987541] Modules linked in:
      [    1.990631] CPU: 48 PID: 295 Comm: cpuhp/48 Tainted: G        W       4.8.0-rc6-preempt-vol+ #9
      [    1.999435] Hardware name: Cavium ThunderX CN88XX board (DT)
      [    2.005159] task: fffffe0fe89cc300 task.stack: fffffe0fe8b8c000
      [    2.011158] PC is at try_to_wake_up+0x194/0x34c
      [    2.015737] LR is at try_to_wake_up+0x150/0x34c
      [    2.020318] pc : [<fffffc00080e7468>] lr : [<fffffc00080e7424>] pstate: 600000c5
      [    2.027803] sp : fffffe0fe8b8fb10
      [    2.031149] x29: fffffe0fe8b8fb10 x28: 0000000000000000
      [    2.036522] x27: fffffc0008c63bc8 x26: 0000000000001000
      [    2.041896] x25: fffffc0008c63c80 x24: fffffc0008bfb200
      [    2.047270] x23: 00000000000000c0 x22: 0000000000000004
      [    2.052642] x21: fffffe0fe89d25bc x20: 0000000000001000
      [    2.058014] x19: fffffe0fe89d1d00 x18: 0000000000000000
      [    2.063386] x17: 0000000000000000 x16: 0000000000000000
      [    2.068760] x15: 0000000000000018 x14: 0000000000000000
      [    2.074133] x13: 0000000000000000 x12: 0000000000000000
      [    2.079505] x11: 0000000000000000 x10: 0000000000000000
      [    2.084879] x9 : 0000000000000000 x8 : 0000000000000000
      [    2.090251] x7 : 0000000000000040 x6 : 0000000000000000
      [    2.095621] x5 : ffffffffffffffff x4 : 0000000000000000
      [    2.100991] x3 : 0000000000000000 x2 : 0000000000000000
      [    2.106364] x1 : fffffc0008be4c24 x0 : ffffff0ffffada80
      [    2.111737]
      [    2.113236] Process cpuhp/48 (pid: 295, stack limit = 0xfffffe0fe8b8c020)
      [    2.120102] Stack: (0xfffffe0fe8b8fb10 to 0xfffffe0fe8b90000)
      [    2.125914] fb00:                                   fffffe0fe8b8fb80 fffffc00080e7648
      .
      .
      .
      [    2.442859] Call trace:
      [    2.445327] Exception stack(0xfffffe0fe8b8f940 to 0xfffffe0fe8b8fa70)
      [    2.451843] f940: fffffe0fe89d1d00 0000040000000000 fffffe0fe8b8fb10 fffffc00080e7468
      [    2.459767] f960: fffffe0fe8b8f980 fffffc00080e4958 ffffff0ff91ab200 fffffc00080e4b64
      [    2.467690] f980: fffffe0fe8b8f9d0 fffffc00080e515c fffffe0fe8b8fa80 0000000000000000
      [    2.475614] f9a0: fffffe0fe8b8f9d0 fffffc00080e58e4 fffffe0fe8b8fa80 0000000000000000
      [    2.483540] f9c0: fffffe0fe8d10000 0000000000000040 fffffe0fe8b8fa50 fffffc00080e5ac4
      [    2.491465] f9e0: ffffff0ffffada80 fffffc0008be4c24 0000000000000000 0000000000000000
      [    2.499387] fa00: 0000000000000000 ffffffffffffffff 0000000000000000 0000000000000040
      [    2.507309] fa20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
      [    2.515233] fa40: 0000000000000000 0000000000000000 0000000000000000 0000000000000018
      [    2.523156] fa60: 0000000000000000 0000000000000000
      [    2.528089] [<fffffc00080e7468>] try_to_wake_up+0x194/0x34c
      [    2.533723] [<fffffc00080e7648>] wake_up_process+0x28/0x34
      [    2.539275] [<fffffc00080d3764>] create_worker+0x110/0x19c
      [    2.544824] [<fffffc00080d69dc>] alloc_unbound_pwq+0x3cc/0x4b0
      [    2.550724] [<fffffc00080d6bcc>] wq_update_unbound_numa+0x10c/0x1e4
      [    2.557066] [<fffffc00080d7d78>] workqueue_online_cpu+0x220/0x28c
      [    2.563234] [<fffffc00080bd288>] cpuhp_invoke_callback+0x6c/0x168
      [    2.569398] [<fffffc00080bdf74>] cpuhp_up_callbacks+0x44/0xe4
      [    2.575210] [<fffffc00080be194>] cpuhp_thread_fun+0x13c/0x148
      [    2.581027] [<fffffc00080dfbac>] smpboot_thread_fn+0x19c/0x1a8
      [    2.586929] [<fffffc00080dbd64>] kthread+0xdc/0xf0
      [    2.591776] [<fffffc0008083380>] ret_from_fork+0x10/0x50
      [    2.597147] Code: b00057e1 91304021 91005021 b8626822 (b8606821)
      [    2.603464] ---[ end trace 58c0cd36b88802bc ]---
      [    2.608138] Kernel panic - not syncing: Fatal exception
      
      Fix by moving call to numa_store_cpu_info() for all CPUs into
      smp_prepare_cpus(), which happens before wq_numa_init().  Since
      smp_store_cpu_info() now contains only a single function call,
      simplify by removing the function and out-lining its contents.
      Suggested-by: NRobert Richter <rric@kernel.org>
      Fixes: 1a2db300 ("arm64, numa: Add NUMA support for arm64 platforms.")
      Cc: <stable@vger.kernel.org> # 4.7.x-
      Signed-off-by: NDavid Daney <david.daney@cavium.com>
      Reviewed-by: NRobert Richter <rrichter@cavium.com>
      Tested-by: NYisheng Xie <xieyisheng1@huawei.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      c18df0ad
  13. 20 9月, 2016 2 次提交
  14. 17 9月, 2016 3 次提交
  15. 15 9月, 2016 1 次提交
    • D
      arm64: Improve kprobes test for atomic sequence · 3e593f66
      David A. Long 提交于
      Kprobes searches backwards a finite number of instructions to determine if
      there is an attempt to probe a load/store exclusive sequence. It stops when
      it hits the maximum number of instructions or a load or store exclusive.
      However this means it can run up past the beginning of the function and
      start looking at literal constants. This has been shown to cause a false
      positive and blocks insertion of the probe. To fix this, further limit the
      backwards search to stop if it hits a symbol address from kallsyms. The
      presumption is that this is the entry point to this code (particularly for
      the common case of placing probes at the beginning of functions).
      
      This also improves efficiency by not searching code that is not part of the
      function. There may be some possibility that the label might not denote the
      entry path to the probed instruction but the likelihood seems low and this
      is just another example of how the kprobes user really needs to be
      careful about what they are doing.
      Acked-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NDavid A. Long <dave.long@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      3e593f66
  16. 12 9月, 2016 1 次提交
    • M
      arm64: use alternative auto-nop · 6ba3b554
      Mark Rutland 提交于
      Make use of the new alternative_if and alternative_else_nop_endif and
      get rid of our homebew NOP sleds, making the code simpler to read.
      
      Note that for cpu_do_switch_mm the ret has been moved out of the
      alternative sequence, and in the default case there will be three
      additional NOPs executed.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      6ba3b554
  17. 09 9月, 2016 1 次提交
    • S
      arm64: Work around systems with mismatched cache line sizes · 116c81f4
      Suzuki K Poulose 提交于
      Systems with differing CPU i-cache/d-cache line sizes can cause
      problems with the cache management by software when the execution
      is migrated from one to another. Usually, the application reads
      the cache size on a CPU and then uses that length to perform cache
      operations. However, if it gets migrated to another CPU with a smaller
      cache line size, things could go completely wrong. To prevent such
      cases, always use the smallest cache line size among the CPUs. The
      kernel CPU feature infrastructure already keeps track of the safe
      value for all CPUID registers including CTR. This patch works around
      the problem by :
      
      For kernel, dynamically patch the kernel to read the cache size
      from the system wide copy of CTR_EL0.
      
      For applications, trap read accesses to CTR_EL0 (by clearing the SCTLR.UCT)
      and emulate the mrs instruction to return the system wide safe value
      of CTR_EL0.
      
      For faster access (i.e, avoiding to lookup the system wide value of CTR_EL0
      via read_system_reg), we keep track of the pointer to table entry for
      CTR_EL0 in the CPU feature infrastructure.
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      116c81f4