1. 12 Nov, 2016 8 commits
    • arm64: smp: prepare for smp_processor_id() rework · 580efaa7
      Mark Rutland committed
      Subsequent patches will make smp_processor_id() use a percpu variable.
      This will make smp_processor_id() dependent on the percpu offset, and
      thus we cannot use smp_processor_id() to figure out what to initialise
      the offset to.
      
      Prepare for this by initialising the percpu offset based on
      current::cpu, which will work regardless of how smp_processor_id() is
      implemented. Also, make this relationship obvious by placing this code
      together at the start of secondary_start_kernel().
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: move sp_el0 and tpidr_el1 into cpu_suspend_ctx · 623b476f
      Mark Rutland committed
      When returning from idle, we rely on the fact that thread_info lives at
      the end of the kernel stack, and restore this by masking the saved stack
      pointer. Subsequent patches will sever the relationship between the
      stack and thread_info, and to cater for this we must save/restore sp_el0
      explicitly, storing it in cpu_suspend_ctx.
      
      As cpu_suspend_ctx must be doubleword aligned, this leaves us with an
      extra slot in cpu_suspend_ctx. We can use this to save/restore tpidr_el1
      in the same way, which simplifies the code, avoiding pointer chasing on
      the restore path (as we no longer need to load thread_info::cpu followed
      by the relevant slot in __per_cpu_offset based on this).
      
      This patch stashes both registers in cpu_suspend_ctx.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: prep stack walkers for THREAD_INFO_IN_TASK · 9bbd4c56
      Mark Rutland committed
      When CONFIG_THREAD_INFO_IN_TASK is selected, task stacks may be freed
      before a task is destroyed. To account for this, the stacks are
      refcounted, and when manipulating the stack of another task, it is
      necessary to get/put the stack to ensure it isn't freed and/or re-used
      while we do so.
      
      This patch reworks the arm64 stack walking code to account for this.
      When CONFIG_THREAD_INFO_IN_TASK is not selected these perform no
      refcounting, and this should only be a structural change that does not
      affect behaviour.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
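The get/put pattern described in this commit can be sketched in plain C as follows. This is a single-threaded userspace model, not the kernel code: `struct task_model`, `try_get_stack()`, `put_stack()`, `walk_stack_safely()` and `count_one_frame()` are hypothetical names chosen for illustration, and a real implementation would use atomic reference counts.

```c
#include <stddef.h>

/* Hypothetical model of a task whose stack can be freed before the task. */
struct task_model {
	void *stack;        /* NULL once the stack has been released */
	int stack_refcount; /* 0 means the stack is already gone */
};

/* Take a reference on the stack; returns NULL if it was already freed. */
void *try_get_stack(struct task_model *t)
{
	if (t->stack_refcount == 0)
		return NULL;
	t->stack_refcount++;
	return t->stack;
}

/* Drop a reference; the last put releases the stack (modelled as NULL). */
void put_stack(struct task_model *t)
{
	if (--t->stack_refcount == 0)
		t->stack = NULL;
}

/* Example walker: reports one frame for any live stack. */
int count_one_frame(void *stack)
{
	return stack ? 1 : 0;
}

/* Walk another task's stack only while holding a reference to it. */
int walk_stack_safely(struct task_model *t, int (*walk)(void *stack))
{
	void *stack = try_get_stack(t);
	int ret;

	if (!stack)
		return -1; /* stack already freed, nothing to walk */
	ret = walk(stack);
	put_stack(t);
	return ret;
}
```

The point of the wrapper is that the walker never sees a stack that could be freed or re-used underneath it; callers that cannot get a reference simply skip the walk.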
    • arm64: unexport walk_stackframe · 2020a5ae
      Mark Rutland committed
      The walk_stackframe function is architecture-specific, with a varying
      prototype, and common code should not use it directly. None of its
      current users can be built as modules. With THREAD_INFO_IN_TASK, users
      will also need to hold a stack reference before calling it.
      
      There's no reason for it to be exported, and it's very easy to misuse,
      so unexport it for now.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: traps: simplify die() and __die() · 876e7a38
      Mark Rutland committed
      In arm64's die and __die routines we pass around a thread_info, and
      subsequently use this to determine the relevant task_struct, and the end
      of the thread's stack. Subsequent patches will decouple thread_info from
      the stack, and this approach will no longer work.
      
      To figure out the end of the stack, we can use the new generic
      end_of_stack() helper. As we only call __die() from die(), and die()
      always deals with the current task, we can remove the parameter and have
      both acquire current directly, which also makes it clear that __die
      can't be called for arbitrary tasks.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: factor out current_stack_pointer · a9ea0017
      Mark Rutland committed
      We define current_stack_pointer in <asm/thread_info.h>, though other
      files and headers relying upon it do not have this necessary include, and
      are thus fragile to changes in the header soup.
      
      Subsequent patches will affect the header soup such that directly
      including <asm/thread_info.h> may result in a circular header include in
      some of these cases, so we can't simply include <asm/thread_info.h>.
      
      Instead, factor current_stack_pointer into its own header, and have all
      existing users include this explicitly.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: asm-offsets: remove unused definitions · 3fe12da4
      Mark Rutland committed
      Subsequent patches will move the thread_info::{task,cpu} fields, and the
      current TI_{TASK,CPU} offset definitions are not used anywhere.
      
      This patch removes the redundant definitions.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: thread_info remove stale items · dcbe0285
      Mark Rutland committed
      We have a comment claiming __switch_to() cares about where cpu_context
      is located relative to cpu_domain in thread_info. However arm64 has
      never had a thread_info::cpu_domain field, and neither __switch_to nor
      cpu_switch_to care where the cpu_context field is relative to others.
      
      Additionally, the init_thread_info alias is never used anywhere in the
      kernel, and will shortly become problematic when thread_info is moved
      into task_struct.
      
      This patch removes both.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  2. 10 Nov, 2016 4 commits
  3. 08 Nov, 2016 18 commits
  4. 27 Oct, 2016 3 commits
    • arm64: mm: fix __page_to_voff definition · 3fa72fe9
      Neeraj Upadhyay committed
      Fix parameter name for __page_to_voff, to match its definition.
      At present, we don't see any issue, as page_to_virt's caller
      declares 'page'.
      
      Fixes: 9f287591 ("arm64: mm: restrict virt_to_page() to the linear mapping")
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64/numa: fix incorrect log for memory-less node · 3f7a09f4
      Hanjun Guo committed
      When booting on a NUMA system with a memory-less node (no
      memory DIMM on that memory controller), the message printed
      by setup_node_data() is incorrect:
      
      NUMA: Initmem setup node 2 [mem 0x00000000-0xffffffffffffffff]
      
      It could be fixed by printing [mem 0x00000000-0x00000000] when
      end_pfn is 0, but printing <memory-less node> is more useful.
      
      Fixes: 1a2db300 ("arm64, numa: Add NUMA support for arm64 platforms.")
      Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yisheng Xie <xieyisheng1@huawei.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
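A minimal sketch of the logging decision this commit describes, written as ordinary userspace C. The function name `format_initmem_line()`, the `PAGE_SHIFT` value of 12, and the exact format strings are assumptions made for illustration; the kernel's own code may differ.

```c
#include <stdio.h>
#include <stddef.h>

#define PAGE_SHIFT 12 /* assumed 4 KB pages, for illustration only */

/*
 * Build the "Initmem setup" line for a node. A node with no pages
 * (start_pfn >= end_pfn) gets a descriptive tag instead of a range
 * whose end address would underflow to 0xffffffffffffffff.
 */
int format_initmem_line(char *buf, size_t len, int nid,
			unsigned long long start_pfn,
			unsigned long long end_pfn)
{
	if (start_pfn < end_pfn)
		return snprintf(buf, len,
				"NUMA: Initmem setup node %d [mem 0x%08llx-0x%08llx]",
				nid, start_pfn << PAGE_SHIFT,
				(end_pfn << PAGE_SHIFT) - 1);

	return snprintf(buf, len,
			"NUMA: Initmem setup node %d [<memory-less node>]", nid);
}
```

The underflow in the original message comes from computing the last byte address as (end_pfn << PAGE_SHIFT) - 1 with end_pfn equal to 0, which is why the empty case is handled before any arithmetic.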
    • arm64/numa: fix pcpu_cpu_distance() to get correct CPU proximity · 26984c3b
      Yisheng Xie committed
      The pcpu_build_alloc_info() function groups CPUs according to their
      proximity by calling the architecture-provided callback
      @cpu_distance_fn.
      
      For arm64 the callback of @cpu_distance_fn is
          pcpu_cpu_distance(from, to)
              -> node_distance(from, to)
      The @from and @to arguments of node_distance() should be NUMA node ids.
      
      However, pcpu_cpu_distance() in arch/arm64/mm/numa.c just passes the
      CPU ids as @from and @to, without converting them to NUMA node ids.
      
      With this incorrect CPU proximity from the architecture code, each CPU
      may end up in its own group, pushing group_cnt out of bounds:
      
      	setup_per_cpu_areas()
      		pcpu_embed_first_chunk()
      			pcpu_build_alloc_info()
      In pcpu_build_alloc_info(), cpu_distance_fn() will return
      REMOTE_DISTANCE if we pass CPU ids (0, 1, 2, ...), so the check
      cpu_distance_fn(cpu, tcpu) > LOCAL_DISTANCE will wrongly be true.
      
      This may result in triggering the BUG_ON(unit != nr_units) later:
      
      [    0.000000] kernel BUG at mm/percpu.c:1916!
      [    0.000000] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
      [    0.000000] Modules linked in:
      [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc1-00003-g14155caf-dirty #26
      [    0.000000] Hardware name: Hisilicon Hi1616 Evaluation Board (DT)
      [    0.000000] task: ffff000008d6e900 task.stack: ffff000008d60000
      [    0.000000] PC is at pcpu_embed_first_chunk+0x420/0x704
      [    0.000000] LR is at pcpu_embed_first_chunk+0x3bc/0x704
      [    0.000000] pc : [<ffff000008c754f4>] lr : [<ffff000008c75490>] pstate: 800000c5
      [    0.000000] sp : ffff000008d63eb0
      [    0.000000] x29: ffff000008d63eb0 [    0.000000] x28: 0000000000000000
      [    0.000000] x27: 0000000000000040 [    0.000000] x26: ffff8413fbfcef00
      [    0.000000] x25: 0000000000000042 [    0.000000] x24: 0000000000000042
      [    0.000000] x23: 0000000000001000 [    0.000000] x22: 0000000000000046
      [    0.000000] x21: 0000000000000001 [    0.000000] x20: ffff000008cb3bc8
      [    0.000000] x19: ffff8413fbfcf570 [    0.000000] x18: 0000000000000000
      [    0.000000] x17: ffff000008e49ae0 [    0.000000] x16: 0000000000000003
      [    0.000000] x15: 000000000000001e [    0.000000] x14: 0000000000000004
      [    0.000000] x13: 0000000000000000 [    0.000000] x12: 000000000000006f
      [    0.000000] x11: 00000413fbffff00 [    0.000000] x10: 0000000000000004
      [    0.000000] x9 : 0000000000000000 [    0.000000] x8 : 0000000000000001
      [    0.000000] x7 : ffff8413fbfcf63c [    0.000000] x6 : ffff000008d65d28
      [    0.000000] x5 : ffff000008d65e50 [    0.000000] x4 : 0000000000000000
      [    0.000000] x3 : ffff000008cb3cc8 [    0.000000] x2 : 0000000000000040
      [    0.000000] x1 : 0000000000000040 [    0.000000] x0 : 0000000000000000
      [...]
      [    0.000000] Call trace:
      [    0.000000] Exception stack(0xffff000008d63ce0 to 0xffff000008d63e10)
      [    0.000000] 3ce0: ffff8413fbfcf570 0001000000000000 ffff000008d63eb0 ffff000008c754f4
      [    0.000000] 3d00: ffff000008d63d50 ffff0000081af210 00000413fbfff010 0000000000001000
      [    0.000000] 3d20: ffff000008d63d50 ffff0000081af220 00000413fbfff010 0000000000001000
      [    0.000000] 3d40: 00000413fbfcef00 0000000000000004 ffff000008d63db0 ffff0000081af390
      [    0.000000] 3d60: 00000413fbfcef00 0000000000001000 0000000000000000 0000000000001000
      [    0.000000] 3d80: 0000000000000000 0000000000000040 0000000000000040 ffff000008cb3cc8
      [    0.000000] 3da0: 0000000000000000 ffff000008d65e50 ffff000008d65d28 ffff8413fbfcf63c
      [    0.000000] 3dc0: 0000000000000001 0000000000000000 0000000000000004 00000413fbffff00
      [    0.000000] 3de0: 000000000000006f 0000000000000000 0000000000000004 000000000000001e
      [    0.000000] 3e00: 0000000000000003 ffff000008e49ae0
      [    0.000000] [<ffff000008c754f4>] pcpu_embed_first_chunk+0x420/0x704
      [    0.000000] [<ffff000008c6658c>] setup_per_cpu_areas+0x38/0xc8
      [    0.000000] [<ffff000008c608d8>] start_kernel+0x10c/0x390
      [    0.000000] [<ffff000008c601d8>] __primary_switched+0x5c/0x64
      [    0.000000] Code: b8018660 17ffffd7 6b16037f 54000080 (d4210000)
      [    0.000000] ---[ end trace 0000000000000000 ]---
      [    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
      
      Fix this by getting each CPU's node id with early_cpu_to_node() and
      passing it to node_distance(), as originally intended.
      
      Fixes: 7af3a0a9 ("arm64/numa: support HAVE_SETUP_PER_CPU_AREA")
      Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
      Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Zhen Lei <thunder.leizhen@huawei.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
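The difference between the buggy and fixed callback can be modelled in userspace C as below. The 4-CPU, 2-node topology, the distance values, and the `_buggy`/`_fixed` suffixes are assumptions for illustration; `node_distance()` and `early_cpu_to_node()` here are simple stand-ins for the kernel helpers of the same name, not their real implementations.

```c
#include <assert.h>

/* Hypothetical topology: 4 CPUs on 2 NUMA nodes. */
#define NR_CPUS         4
#define LOCAL_DISTANCE  10
#define REMOTE_DISTANCE 20

/* CPUs 0-1 sit on node 0, CPUs 2-3 on node 1 (assumed layout). */
static const int cpu_to_node_map[NR_CPUS] = { 0, 0, 1, 1 };

/* Stand-in for node_distance(): both arguments must be node ids. */
int node_distance(int from_nid, int to_nid)
{
	return from_nid == to_nid ? LOCAL_DISTANCE : REMOTE_DISTANCE;
}

/* Stand-in for early_cpu_to_node(): map a CPU id to its node id. */
int early_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu_to_node_map[cpu];
}

/* Buggy variant: hands CPU ids to a function expecting node ids. */
int pcpu_cpu_distance_buggy(int from, int to)
{
	return node_distance(from, to);
}

/* Fixed variant: convert each CPU id to a node id first. */
int pcpu_cpu_distance_fixed(int from, int to)
{
	return node_distance(early_cpu_to_node(from), early_cpu_to_node(to));
}
```

In this model CPUs 0 and 1 share a node, yet the buggy variant reports them as remote because it compares the ids 0 and 1 as if they were nodes. That is the misgrouping the commit message blames for the BUG_ON.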
  5. 24 Oct, 2016 1 commit
  6. 22 Oct, 2016 3 commits
    • arm64: dts: uniphier: change MIO node to SD control node · 8e68c65d
      Masahiro Yamada committed
      I made a mistake because the Media I/O block is not implemented in
      this SoC.
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
    • arm64: uniphier: select ARCH_HAS_RESET_CONTROLLER · 75924903
      Masahiro Yamada committed
      The UniPhier reset driver (drivers/reset/reset-uniphier.c) has been
      merged.  Select ARCH_HAS_RESET_CONTROLLER from the SoC Kconfig.
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
    • arm64: KVM: Take S1 walks into account when determining S2 write faults · 60e21a0e
      Will Deacon committed
      The WnR bit in the HSR/ESR_EL2 indicates whether a data abort was
      generated by a read or a write instruction. For stage 2 data aborts
      generated by a stage 1 translation table walk (i.e. the actual page
      table access faults at EL2), the WnR bit therefore reports whether the
      instruction generating the walk was a load or a store, *not* whether the
      page table walker was reading or writing the entry.
      
      For page tables marked as read-only at stage 2 (e.g. due to KSM merging
      them with the tables from another guest), this could result in livelock,
      where a page table walk generated by a load instruction attempts to
      set the access flag in the stage 1 descriptor, but fails to trigger
      CoW in the host since only a read fault is reported.
      
      This patch modifies the arm64 kvm_vcpu_dabt_iswrite function to
      take into account stage 2 faults in stage 1 walks. Since DBM cannot be
      disabled at EL2 for CPUs that implement it, we assume that these faults
      are always caused by writes, avoiding the livelock situation at the
      expense of occasional, spurious CoWs.
      
      We could, in theory, do a bit better by checking the guest TCR
      configuration and inspecting the page table to see why the PTE faulted.
      However, I doubt this is measurable in practice, and the threat of
      livelock is real.
      
      Cc: <stable@vger.kernel.org>
      Cc: Julien Grall <julien.grall@arm.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
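The decision described in this commit can be sketched as a pure function over the syndrome value. This is an illustrative userspace model: the helper names `dabt_is_s1ptw()` and `dabt_is_write()` are hypothetical, and the bit positions (WnR at bit 6, S1PTW at bit 7) reflect my reading of the ARMv8 data abort syndrome encoding and should be checked against the architecture manual.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed data abort syndrome bits (see lead-in for the caveat). */
#define ESR_ELx_WNR   (UINT32_C(1) << 6) /* write, not read */
#define ESR_ELx_S1PTW (UINT32_C(1) << 7) /* stage 2 fault on a stage 1 walk */

/* True if the stage 2 abort was taken during a stage 1 table walk. */
bool dabt_is_s1ptw(uint32_t esr)
{
	return (esr & ESR_ELx_S1PTW) != 0;
}

/*
 * Treat the abort as a write if the instruction was a store, or if the
 * fault came from a stage 1 walk: the walker may have been updating the
 * descriptor, and WnR only describes the instruction, not the walker.
 */
bool dabt_is_write(uint32_t esr)
{
	return (esr & ESR_ELx_WNR) != 0 || dabt_is_s1ptw(esr);
}
```

Reporting every stage 1 walk fault as a write is deliberately conservative: it costs an occasional unnecessary CoW, but a load whose walk needs to set the access flag can no longer loop forever on a read-only table.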
  7. 21 Oct, 2016 1 commit
  8. 20 Oct, 2016 2 commits
    • arm64: remove pr_cont abuse from mem_init · f7881bd6
      Mark Rutland committed
      All the lines printed by mem_init are independent, with each ending with
      a newline. While they logically form a large block, none are actually
      continuations of previous lines.
      
      The kernel-side printk code and the userspace dmesg tool differ in their
      handling of KERN_CONT following a newline, and while this isn't always a
      problem kernel-side, it does cause difficulty for userspace. Using
      pr_cont causes the userspace tool to not print line prefixes (e.g.
      timestamps) even when following a newline, mis-aligning the output and
      making it harder to read, e.g.
      
      [    0.000000] Virtual kernel memory layout:
      [    0.000000]     modules : 0xffff000000000000 - 0xffff000008000000   (   128 MB)
          vmalloc : 0xffff000008000000 - 0xffff7dffbfff0000   (129022 GB)
            .text : 0xffff000008080000 - 0xffff0000088b0000   (  8384 KB)
          .rodata : 0xffff0000088b0000 - 0xffff000008c50000   (  3712 KB)
            .init : 0xffff000008c50000 - 0xffff000008d50000   (  1024 KB)
            .data : 0xffff000008d50000 - 0xffff000008e25200   (   853 KB)
             .bss : 0xffff000008e25200 - 0xffff000008e6bec0   (   284 KB)
          fixed   : 0xffff7dfffe7fd000 - 0xffff7dfffec00000   (  4108 KB)
          PCI I/O : 0xffff7dfffee00000 - 0xffff7dffffe00000   (    16 MB)
          vmemmap : 0xffff7e0000000000 - 0xffff800000000000   (  2048 GB maximum)
                    0xffff7e0000000000 - 0xffff7e0026000000   (   608 MB actual)
          memory  : 0xffff800000000000 - 0xffff800980000000   ( 38912 MB)
      [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=6, Nodes=1
      
      Fix this by using pr_notice consistently for all lines, which both the
      kernel and userspace are happy with.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: fix show_regs fallout from KERN_CONT changes · db4b0710
      Mark Rutland committed
      Recently in commit 4bcc595c ("printk: reinstate KERN_CONT for
      printing continuation lines"), the behaviour of printk changed w.r.t.
      KERN_CONT. Now, KERN_CONT is mandatory to continue existing lines.
      Without this, prefixes are inserted, making output illegible, e.g.
      
      [ 1007.069010] pc : [<ffff00000871898c>] lr : [<ffff000008718948>] pstate: 40000145
      [ 1007.076329] sp : ffff000008d53ec0
      [ 1007.079606] x29: ffff000008d53ec0 [ 1007.082797] x28: 0000000080c50018
      [ 1007.086160]
      [ 1007.087630] x27: ffff000008e0c7f8 [ 1007.090820] x26: ffff80097631ca00
      [ 1007.094183]
      [ 1007.095653] x25: 0000000000000001 [ 1007.098843] x24: 000000ea68b61cac
      [ 1007.102206]
      
      ... or when dumped with the userspace dmesg tool, which has slightly
      different implicit newline behaviour. e.g.
      
      [ 1007.069010] pc : [<ffff00000871898c>] lr : [<ffff000008718948>] pstate: 40000145
      [ 1007.076329] sp : ffff000008d53ec0
      [ 1007.079606] x29: ffff000008d53ec0
      [ 1007.082797] x28: 0000000080c50018
      [ 1007.086160]
      [ 1007.087630] x27: ffff000008e0c7f8
      [ 1007.090820] x26: ffff80097631ca00
      [ 1007.094183]
      [ 1007.095653] x25: 0000000000000001
      [ 1007.098843] x24: 000000ea68b61cac
      [ 1007.102206]
      
      We can't simply always use KERN_CONT for lines which may or may not be
      continuations. That causes line prefixes (e.g. timestamps) to be
      suppressed, and the alignment of all but the first line will be broken.
      
      For even more fun, we can't simply insert some dummy empty-string printk
      calls, as GCC warns for an empty printk string, and even if we pass
      KERN_DEFAULT explicitly to silence the warning, the prefix gets swallowed
      unless there is an additional part to the string.
      
      Instead, we must manually iterate over pairs of registers, which gives
      us the legible output we want in either case, e.g.
      
      [  169.771790] pc : [<ffff00000871898c>] lr : [<ffff000008718948>] pstate: 40000145
      [  169.779109] sp : ffff000008d53ec0
      [  169.782386] x29: ffff000008d53ec0 x28: 0000000080c50018
      [  169.787650] x27: ffff000008e0c7f8 x26: ffff80097631de00
      [  169.792913] x25: 0000000000000001 x24: 00000027827b2cf4
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
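Iterating over register pairs, as this commit describes, might look like the following userspace sketch. It formats into a buffer with snprintf() rather than calling printk(), and `format_reg_pairs()` is a hypothetical helper name; the kernel's show_regs code is not reproduced here.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Format registers x<top>..x0 into buf, highest first, two per line.
 * Every line is emitted whole, so no continuation marker is needed.
 * Returns the number of characters written, excluding the final NUL.
 */
size_t format_reg_pairs(char *buf, size_t len,
			const unsigned long long *regs, int top)
{
	size_t off = 0;
	int i = top;
	int n;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	/* An even top index has no partner, so it gets its own line. */
	if (i >= 0 && i % 2 == 0) {
		n = snprintf(buf + off, len - off, "x%-2d: %016llx\n",
			     i, regs[i]);
		if (n < 0 || (size_t)n >= len - off) {
			buf[off] = '\0'; /* drop the truncated line */
			return off;
		}
		off += (size_t)n;
		i--;
	}

	/* The remaining registers pair up as (odd, even). */
	while (i >= 1) {
		n = snprintf(buf + off, len - off,
			     "x%-2d: %016llx x%-2d: %016llx\n",
			     i, regs[i], i - 1, regs[i - 1]);
		if (n < 0 || (size_t)n >= len - off) {
			buf[off] = '\0'; /* drop the truncated line */
			return off;
		}
		off += (size_t)n;
		i -= 2;
	}
	return off;
}
```

Because each output line is produced by one formatting call, the result does not depend on how a log consumer treats continuation records, which is the property the commit is after.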