1. 21 Jun 2016 (6 commits)
  2. 17 Jun 2016 (1 commit)
    • arm64: kgdb: Match pstate size with gdbserver protocol · 0d15ef67
      Committed by Daniel Thompson
      Current versions of gdb do not interoperate cleanly with kgdb on arm64
      systems because gdb and kgdb do not use the same register description.
      This patch modifies kgdb to work with recent releases of gdb (>= 7.8.1).
      
      Compatibility with gdb (after the patch is applied) is as follows:
      
        gdb-7.6 and earlier  Ok
        gdb-7.7 series       Works if user provides custom target description
        gdb-7.8(.0)          Works if user provides custom target description
        gdb-7.8.1 and later  Ok
      
      When commit 44679a4f ("arm64: KGDB: Add step debugging support") was
      introduced it was paired with a gdb patch that made an incompatible
      change to the gdbserver protocol. This patch was eventually merged into
      the gdb sources:
      https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=a4d9ba85ec5597a6a556afe26b712e878374b9dd
      
      The change to the protocol was mostly made to simplify big-endian support
      inside the kernel gdb stub. Unfortunately the gdb project released
      gdb-7.7.x and gdb-7.8.0 before the protocol incompatibility was identified
      and reversed:
      https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=bdc144174bcb11e808b4e73089b850cf9620a7ee
      
      This leaves us in a position where kgdb still implements the protocol
      variant that gdb no longer uses; gdb-7.8.1, which restored the original
      behaviour, was released on 2014-10-29.
      
      I don't believe it is possible to detect/correct the protocol
      incompatibility, which means the kernel must take a view about which
      version of the gdb remote protocol is "correct". This patch takes the
      view that the original/current version of the protocol is correct
      and that the version found in gdb-7.7.x and gdb-7.8.0 is anomalous.
      Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      0d15ef67
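
      As an aside for readers unfamiliar with kgdb's register handling, the
      standalone sketch below shows the general shape of a register
      description table in the spirit of the kernel's dbg_reg_def_t entries.
      The names, sizes and the main() driver are illustrative assumptions,
      not the actual arm64 table touched by 0d15ef67; the point is only that
      the size kgdb reports for pstate must match the 4-byte slot in gdb's
      aarch64 target description.

      #include <stddef.h>
      #include <stdio.h>

      /* Illustrative register description: a name plus the number of bytes
       * transferred for that register over the gdb remote protocol. */
      struct reg_def {
              const char *name;
              int size;
      };

      static const struct reg_def regs[] = {
              { "x0",     8 },
              /* x1-x30, sp and pc elided */
              { "pstate", 4 },        /* 4 bytes, matching gdb's pstate/cpsr slot */
      };

      int main(void)
      {
              for (size_t i = 0; i < sizeof(regs) / sizeof(regs[0]); i++)
                      printf("%-8s %d bytes\n", regs[i].name, regs[i].size);
              return 0;
      }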
  3. 15 Jun 2016 (3 commits)
    • arm64: spinlock: Ensure forward-progress in spin_unlock_wait · c56bdcac
      Committed by Will Deacon
      Rather than wait until we observe the lock being free (which might never
      happen), we can also return from spin_unlock_wait if we observe that the
      lock is now held by somebody else, which implies that it was unlocked
      but we just missed seeing it in that state.
      
      Furthermore, in such a scenario there is no longer a need to write back
      the value that we loaded, since we know that there has been a lock
      hand-off, which is sufficient to publish any stores prior to the
      unlock_wait because the ARM architecture ensures that a Store-Release
      instruction is multi-copy atomic when observed by a Load-Acquire
      instruction.
      
      The litmus test is something like:
      
      AArch64
      {
      0:X1=x; 0:X3=y;
      1:X1=y;
      2:X1=y; 2:X3=x;
      }
       P0          | P1           | P2           ;
       MOV W0,#1   | MOV W0,#1    | LDAR W0,[X1] ;
       STR W0,[X1] | STLR W0,[X1] | LDR W2,[X3]  ;
       DMB SY      |              |              ;
       LDR W2,[X3] |              |              ;
      exists
      (0:X2=0 /\ 2:X0=1 /\ 2:X2=0)
      
      where P0 is doing spin_unlock_wait, P1 is doing spin_unlock and P2 is
      doing spin_lock.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      c56bdcac
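
      For readers who prefer C to litmus syntax, the following standalone
      sketch restates the forward-progress idea from c56bdcac using C11
      atomics. The ticket_lock type and its field names are invented for the
      illustration; this is not the arm64 assembly the patch changes.

      #include <stdatomic.h>

      struct ticket_lock {
              _Atomic unsigned short owner;   /* ticket currently being served */
              _Atomic unsigned short next;    /* next ticket to be handed out */
      };

      static void ticket_unlock_wait(struct ticket_lock *lock)
      {
              unsigned short first_owner =
                      atomic_load_explicit(&lock->owner, memory_order_acquire);

              for (;;) {
                      unsigned short owner =
                              atomic_load_explicit(&lock->owner, memory_order_acquire);
                      unsigned short next =
                              atomic_load_explicit(&lock->next, memory_order_acquire);

                      if (owner == next)              /* lock observed free */
                              return;
                      if (owner != first_owner)       /* hand-off seen: it was
                                                         free, we just missed it */
                              return;
              }
      }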
    • arm64: spinlock: fix spin_unlock_wait for LSE atomics · 3a5facd0
      Committed by Will Deacon
      Commit d86b8da0 ("arm64: spinlock: serialise spin_unlock_wait against
      concurrent lockers") fixed spin_unlock_wait for LL/SC-based atomics under
      the premise that the LSE atomics (in particular, the LDADDA instruction)
      are indivisible.
      
      Unfortunately, these instructions are only indivisible when used with the
      -AL (full ordering) suffix and, consequently, the same issue can
      theoretically be observed with LSE atomics, where a later (in program
      order) load can be speculated before the write portion of the atomic
      operation.
      
      This patch fixes the issue by performing a CAS of the lock once we've
      established that it's unlocked, in much the same way as the LL/SC code.
      
      Fixes: d86b8da0 ("arm64: spinlock: serialise spin_unlock_wait against concurrent lockers")
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      3a5facd0
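
      The shape of the fix can also be sketched in portable C11: once the
      lock is observed to be free, confirm that observation with a
      compare-and-swap of the same value, so a concurrent locker cannot have
      its critical-section loads ordered ahead of us. This uses a flat
      integer lock purely for illustration and is not the LDADDA/CAS
      sequence the patch actually emits.

      #include <stdatomic.h>

      typedef _Atomic unsigned int spin_val;  /* 0 means unlocked (illustration) */

      static void sketch_unlock_wait(spin_val *lock)
      {
              for (;;) {
                      unsigned int v = atomic_load_explicit(lock, memory_order_acquire);

                      if (v != 0)
                              continue;       /* still held, keep waiting */

                      /* Re-check with a CAS of the observed (unlocked) value
                       * back onto itself; success gives us a single point at
                       * which the lock was provably free. */
                      if (atomic_compare_exchange_strong_explicit(lock, &v, v,
                                                                  memory_order_acq_rel,
                                                                  memory_order_acquire))
                              return;
              }
      }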
    • arm64: spinlock: order spin_{is_locked,unlock_wait} against local locks · 38b850a7
      Committed by Will Deacon
      spin_is_locked has grown two very different use-cases:
      
      (1) [The sane case] API functions may require a certain lock to be held
          by the caller and can therefore use spin_is_locked as part of an
          assert statement in order to verify that the lock is indeed held.
          For example, usage of assert_spin_locked.
      
      (2) [The insane case] There are two locks, where a CPU takes one of the
          locks and then checks whether or not the other one is held before
          accessing some shared state. For example, the "optimized locking" in
          ipc/sem.c.
      
      In the latter case, the sequence looks like:
      
        spin_lock(&sem->lock);
        if (!spin_is_locked(&sma->sem_perm.lock))
          /* Access shared state */
      
      and requires that the spin_is_locked check is ordered after taking the
      sem->lock. Unfortunately, since our spinlocks are implemented using a
      LDAXR/STXR sequence, the read of &sma->sem_perm.lock can be speculated
      before the STXR and consequently return a stale value.
      
      Whilst this hasn't been seen to cause issues in practice, PowerPC fixed
      the same issue in 51d7d520 ("powerpc: Add smp_mb() to
      arch_spin_is_locked()") and, although we did something similar for
      spin_unlock_wait in d86b8da0 ("arm64: spinlock: serialise
      spin_unlock_wait against concurrent lockers") that doesn't actually take
      care of ordering against local acquisition of a different lock.
      
      This patch adds an smp_mb() to the start of our arch_spin_is_locked and
      arch_spin_unlock_wait routines to ensure that the lock value is always
      loaded after any other locks have been taken by the current CPU.
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      38b850a7
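
      A standalone way to picture the fix, using a C11 fence in place of the
      kernel's smp_mb(): issue a full barrier before loading the other
      lock's value, so that load cannot be satisfied ahead of the store half
      of a lock we have just taken. The flag-style lock below is invented
      for illustration only.

      #include <stdatomic.h>
      #include <stdbool.h>

      static bool sketch_is_locked(_Atomic unsigned int *lock)
      {
              /* Full fence first, mirroring the smp_mb() added at the start
               * of arch_spin_is_locked(): anything this thread has already
               * stored (e.g. its own lock acquisition) is ordered before the
               * read below. */
              atomic_thread_fence(memory_order_seq_cst);

              return atomic_load_explicit(lock, memory_order_relaxed) != 0;
      }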
  4. 14 Jun 2016 (2 commits)
    • arm64: mm: mark fault_info table const · bbb1681e
      Committed by Mark Rutland
      Unlike the debug_fault_info table, we never intentionally alter the
      fault_info table at runtime, and all derived pointers are treated as
      const currently.
      
      Make the table const so that it can be placed in .rodata and protected
      from unintentional writes, as we do for the syscall tables.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      bbb1681e
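
      As a generic illustration of why the change helps (the struct layout
      and entries below are invented, not the arm64 fault_info definition):
      a table declared static const is placed in .rodata, which the kernel
      maps read-only, so a stray write faults instead of silently corrupting
      the handler table.

      #include <signal.h>
      #include <stdio.h>

      struct fault_entry {
              const char *name;       /* human-readable fault name */
              int sig;                /* signal delivered for this fault class */
      };

      /* static const => emitted into .rodata; accidental writes become a
       * compile error, or a protection fault if forced through a cast. */
      static const struct fault_entry fault_table[] = {
              { "translation fault", SIGSEGV },
              { "access flag fault", SIGSEGV },
              { "alignment fault",   SIGBUS  },
      };

      int main(void)
      {
              for (unsigned int i = 0; i < sizeof(fault_table) / sizeof(fault_table[0]); i++)
                      printf("%-20s -> signal %d\n", fault_table[i].name, fault_table[i].sig);
              return 0;
      }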
    • arm64: fix dump_instr when PAN and UAO are in use · c5cea06b
      Committed by Mark Rutland
      If the kernel is set to show unhandled signals, and a user task does not
      handle a SIGILL as a result of an instruction abort, we will attempt to
      log the offending instruction with dump_instr before killing the task.
      
      We use dump_instr to log the encoding of the offending userspace
      instruction. However, dump_instr is also used to dump instructions from
      kernel space, and internally always switches to KERNEL_DS before dumping
      the instruction with get_user. When both PAN and UAO are in use, reading
      a user instruction via get_user while in KERNEL_DS will result in a
      permission fault, which leads to an Oops.
      
      As we have regs corresponding to the context of the original instruction
      abort, we can inspect this and only flip to KERNEL_DS if the original
      abort was taken from the kernel, avoiding this issue. At the same time,
      remove the redundant (and incorrect) comments regarding the order
      dump_mem and dump_instr are called in.
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: <stable@vger.kernel.org> #4.6+
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com>
      Fixes: 57f4959b ("arm64: kernel: Add support for User Access Override")
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      c5cea06b
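
      A sketch of the shape of the fix (the __dump_instr helper is assumed
      here for illustration; this is not the literal patch): only widen the
      address limit when the faulting context was the kernel, so that a
      PAN+UAO system can still read a userspace instruction via get_user().

      static void dump_instr(const char *lvl, struct pt_regs *regs)
      {
              if (!user_mode(regs)) {
                      mm_segment_t fs = get_fs();

                      set_fs(KERNEL_DS);
                      __dump_instr(lvl, regs);
                      set_fs(fs);
              } else {
                      __dump_instr(lvl, regs);
              }
      }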
  5. 08 Jun 2016 (1 commit)
    • arm64: mm: always take dirty state from new pte in ptep_set_access_flags · 0106d456
      Committed by Will Deacon
      Commit 66dbd6e6 ("arm64: Implement ptep_set_access_flags() for
      hardware AF/DBM") ensured that pte flags are updated atomically in the
      face of potential concurrent, hardware-assisted updates. However, Alex
      reports that:
      
       | This patch breaks swapping for me.
       | In the broken case, you'll see either systemd cpu time spike (because
       | it's stuck in a page fault loop) or the system hang (because the
       | application owning the screen is stuck in a page fault loop).
      
      It turns out that this is because the 'dirty' argument to
      ptep_set_access_flags is always 0 for read faults, and so we can't use
      it to set PTE_RDONLY. The failing sequence is:
      
        1. We put down a PTE_WRITE | PTE_DIRTY | PTE_AF pte
        2. Memory pressure -> pte_mkold(pte) -> clear PTE_AF
        3. A read faults due to the missing access flag
        4. ptep_set_access_flags is called with dirty = 0, due to the read fault
        5. pte is then made PTE_WRITE | PTE_DIRTY | PTE_AF | PTE_RDONLY (!)
        6. A write faults, but pte_write is true so we get stuck
      
      The solution is to check the new page table entry (as would be done by
      the generic, non-atomic definition of ptep_set_access_flags that just
      calls set_pte_at) to establish the dirty state.
      
      Cc: <stable@vger.kernel.org> # 4.3+
      Fixes: 66dbd6e6 ("arm64: Implement ptep_set_access_flags() for hardware AF/DBM")
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Reported-by: Alexander Graf <agraf@suse.de>
      Tested-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      0106d456
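
      The failing sequence boils down to trusting the caller's dirty
      argument instead of the new pte. The standalone sketch below models
      pte bits as plain flags to contrast the two approaches; the flag names
      and helpers are invented and only mirror, rather than reproduce, the
      arm64 code.

      #include <stdbool.h>
      #include <stdint.h>

      #define SK_PTE_WRITE  (1u << 0)
      #define SK_PTE_DIRTY  (1u << 1)
      #define SK_PTE_AF     (1u << 2)
      #define SK_PTE_RDONLY (1u << 3)

      /* Broken shape: a read fault passes dirty == false, so a pte that is
       * in fact dirty and writable ends up with PTE_RDONLY set. */
      static uint32_t set_access_flags_from_arg(uint32_t pte, bool dirty)
      {
              pte |= SK_PTE_AF;
              if (dirty)
                      pte &= ~SK_PTE_RDONLY;
              else
                      pte |= SK_PTE_RDONLY;
              return pte;
      }

      /* Fixed shape: derive the dirty/writable state from the new pte
       * itself, as the generic set_pte_at()-based version effectively does. */
      static uint32_t set_access_flags_from_new_pte(uint32_t pte, uint32_t newpte)
      {
              pte |= SK_PTE_AF;
              if ((newpte & SK_PTE_DIRTY) && (newpte & SK_PTE_WRITE))
                      pte &= ~SK_PTE_RDONLY;
              else
                      pte |= SK_PTE_RDONLY;
              return pte;
      }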
  6. 04 Jun 2016 (1 commit)
  7. 03 Jun 2016 (6 commits)
    • arm64: fix alignment when RANDOMIZE_TEXT_OFFSET is enabled · aed7eb83
      Committed by Mark Rutland
      With ARM64_64K_PAGES and RANDOMIZE_TEXT_OFFSET enabled, we hit the
      following issue at boot:
      
      kernel BUG at arch/arm64/mm/mmu.c:480!
      Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
      Modules linked in:
      CPU: 0 PID: 0 Comm: swapper Not tainted 4.6.0 #310
      Hardware name: ARM Juno development board (r2) (DT)
      task: ffff000008d58a80 ti: ffff000008d30000 task.ti: ffff000008d30000
      PC is at map_kernel_segment+0x44/0xb0
      LR is at paging_init+0x84/0x5b0
      pc : [<ffff000008c450b4>] lr : [<ffff000008c451a4>] pstate: 600002c5
      
      Call trace:
      [<ffff000008c450b4>] map_kernel_segment+0x44/0xb0
      [<ffff000008c451a4>] paging_init+0x84/0x5b0
      [<ffff000008c42728>] setup_arch+0x198/0x534
      [<ffff000008c40848>] start_kernel+0x70/0x388
      [<ffff000008c401bc>] __primary_switched+0x30/0x74
      
      Commit 7eb90f2f ("arm64: cover the .head.text section in the .text
      segment mapping") removed the alignment between the .head.text and .text
      sections, and used the _text rather than the _stext interval for mapping
      the .text segment.
      
      Prior to this commit _stext was always section aligned and didn't cause
      any issue even when RANDOMIZE_TEXT_OFFSET was enabled. Since that
      alignment has been removed and _text is used to map the .text segment,
      we need to ensure that _text is always page aligned when RANDOMIZE_TEXT_OFFSET
      is enabled.
      
      This patch adds logic to TEXT_OFFSET fuzzing to ensure that the offset
      is always aligned to the kernel page size. To ensure this, we rely on
      the PAGE_SHIFT being available via Kconfig.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reported-by: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Fixes: 7eb90f2f ("arm64: cover the .head.text section in the .text segment mapping")
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      aed7eb83
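
      The constraint is easy to state in C, even though the kernel enforces
      it in the build system's awk fuzzing: whatever random offset is chosen
      below 2 MiB must be a multiple of the page size implied by the newly
      Kconfig-visible page shift. The program below is a standalone
      illustration, not the Makefile change itself.

      #include <stdio.h>
      #include <stdlib.h>
      #include <time.h>

      #define SZ_2M (2u * 1024 * 1024)

      /* Pick a pseudo-random TEXT_OFFSET below 2 MiB, aligned to the page
       * size implied by page_shift (12, 14 or 16 on arm64). */
      static unsigned int fuzz_text_offset(unsigned int page_shift)
      {
              unsigned int pages = SZ_2M >> page_shift;

              return (unsigned int)(rand() % pages) << page_shift;
      }

      int main(void)
      {
              srand((unsigned int)time(NULL));
              printf("64K pages: TEXT_OFFSET = 0x%06x\n", fuzz_text_offset(16));
              printf(" 4K pages: TEXT_OFFSET = 0x%06x\n", fuzz_text_offset(12));
              return 0;
      }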
    • arm64: move {PAGE,CONT}_SHIFT into Kconfig · 030c4d24
      Committed by Mark Rutland
      In some cases (e.g. the awk for CONFIG_RANDOMIZE_TEXT_OFFSET) we would
      like to make use of PAGE_SHIFT outside of code that can include the
      usual header files.
      
      Add a new CONFIG_ARM64_PAGE_SHIFT for this, likewise with
      ARM64_CONT_SHIFT for consistency.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      030c4d24
    • arm64: mm: dump: log span level · 48dd73c5
      Committed by Mark Rutland
      The page table dump code logs spans of entries at the same level
      (pgd/pud/pmd/pte) which have the same attributes. While we log the
      (decoded) attributes, we don't log the level, which leaves the output
      ambiguous and/or confusing in some cases.
      
      For example:
      
      0xffff800800000000-0xffff800980000000           6G       RW NX SHD AF        BLK UXN MEM/NORMAL
      
      If using 4K pages, this may describe a span of 6 1G block entries at the
      PGD/PUD level, or 3072 2M block entries at the PMD level.
      
      This patch adds the page table level to each output line, removing this
      ambiguity. For the example above, this will produce:
      
      0xffffffc800000000-0xffffffc980000000           6G PUD       RW NX SHD AF        BLK UXN MEM/NORMAL
      
      When 3 level tables are in use, and we use the asm-generic/nopud.h
      definitions, the dump code treats each entry in the PGD as a 1 element
      table at the PUD level, and logs spans as being PUDs, which can be
      confusing. To counteract this, the "PUD" mnemonic is replaced with "PGD"
      when CONFIG_PGTABLE_LEVELS <= 3. Likewise for "PMD" when
      CONFIG_PGTABLE_LEVELS <= 2.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Huang Shijie <shijie.huang@arm.com>
      Cc: Laura Abbott <labbott@fedoraproject.org>
      Cc: Steve Capper <steve.capper@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      48dd73c5
    • arm64: update stale PAGE_OFFSET comment · a13e3a5b
      Committed by Mark Rutland
      Commit ab893fb9 ("arm64: introduce KIMAGE_VADDR as the virtual
      base of the kernel region") logically split KIMAGE_VADDR from
      PAGE_OFFSET, and since commit f9040773 ("arm64: move kernel
      image to base of vmalloc area") the two have been distinct values.
      
      Unfortunately, neither commit updated the comment above these
      definitions, which now erroneously states that PAGE_OFFSET is the start
      of the kernel image rather than the start of the linear mapping.
      
      This patch fixes said comment, and introduces an explanation of
      KIMAGE_VADDR.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      a13e3a5b
    • arm64: report CPU number in bad_mode · 8051f4d1
      Committed by Mark Rutland
      If we take an exception we don't expect (e.g. SError), we report this in
      the bad_mode handler with pr_crit. Depending on the configured log
      level, we may or may not log additional information in functions called
      subsequently. Notably, the messages in dump_stack (including the CPU
      number) are printed with KERN_DEFAULT and may not appear.
      
      Some exceptions have an IMPLEMENTATION DEFINED ESR_ELx.ISS encoding, and
      knowing the CPU number is crucial to correctly decode them. To ensure
      that this is always possible, we should log the CPU number along with
      the ESR_ELx value, so we are not reliant on subsequent logs or
      additional printk configuration options.
      
      This patch logs the CPU number in bad_mode so that a developer can
      decode these exceptions, provided they have access to sufficient
      documentation.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reported-by: Al Grant <Al.Grant@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Martin <dave.martin@arm.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      8051f4d1
    • irqchip/gicv3-its: numa: Enable workaround for Cavium thunderx erratum 23144 · fbf8f40e
      Committed by Ganapatrao Kulkarni
      The workaround prevents the ITS SYNC command from hanging by avoiding
      inter-node I/O and collection/CPU mappings on the ThunderX dual-socket
      platform.
      
      This fix is only applicable to Cavium's ThunderX dual-socket platform.
      Reviewed-by: Robert Richter <rrichter@cavium.com>
      Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
      Signed-off-by: Robert Richter <rrichter@cavium.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      fbf8f40e
  8. 02 Jun 2016 (1 commit)
  9. 01 Jun 2016 (1 commit)
  10. 31 May 2016 (7 commits)
  11. 24 May 2016 (1 commit)
  12. 21 May 2016 (1 commit)
    • exit_thread: remove empty bodies · 5f56a5df
      Committed by Jiri Slaby
      Define HAVE_EXIT_THREAD for archs which want to do something in
      exit_thread. For others, let's define exit_thread as an empty inline.
      
      This is a cleanup before we change the prototype of exit_thread to
      accept a task parameter.
      
      [akpm@linux-foundation.org: fix mips]
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chen Liqin <liqin.linux@gmail.com>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5f56a5df
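
      The shape of the cleanup is small enough to sketch directly; this is a
      paraphrase of the pattern the patch introduces (an opt-in config
      symbol plus an empty inline fallback), not a verbatim copy of the
      header change.

      /*
       * Architectures that select HAVE_EXIT_THREAD provide their own
       * exit_thread(); everyone else gets this empty inline, so the generic
       * exit path can call it unconditionally.
       */
      #ifdef CONFIG_HAVE_EXIT_THREAD
      extern void exit_thread(void);
      #else
      static inline void exit_thread(void)
      {
      }
      #endif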
  13. 20 May 2016 (7 commits)
  14. 18 May 2016 (1 commit)
  15. 17 May 2016 (1 commit)
    • perf core: Add a 'nr' field to perf_event_callchain_context · 3b1fff08
      Committed by Arnaldo Carvalho de Melo
      We will use it to count how many addresses are in the entry->ip[] array,
      excluding PERF_CONTEXT_{KERNEL,USER,etc} entries, so that we can really
      return the number of entries specified by the user via the relevant
      sysctl, kernel.perf_event_max_stack, or via the per-event
      perf_event_attr.sample_max_stack knob.
      
      This way we keep the meaning of perf_sample->ip_callchain->nr, i.e. the
      total number of entries, be they real addresses or PERF_CONTEXT_ markers,
      while honouring the max_stack knobs: the end result will be max_stack
      real entries if a given stack trace has at least that many.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-s8teto51tdqvlfhefndtat9r@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3b1fff08
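
      A standalone sketch of the accounting this enables (the struct and
      helper below are invented stand-ins, not the kernel's
      perf_callchain_entry_ctx API): context markers are still written into
      the ip[] array, but only real addresses count toward the
      user-requested maximum.

      #include <stdbool.h>
      #include <stdint.h>

      struct chain_ctx {
              uint64_t *ip;           /* backing entry->ip[] style array,
                                         sized by the caller for max_stack
                                         plus a few context markers */
              uint32_t  stored;       /* everything written, markers included */
              uint32_t  nr;           /* real addresses only */
              uint32_t  max_stack;    /* per-event / sysctl limit to honour */
      };

      /* Returns false once the requested number of real addresses is hit. */
      static bool chain_store(struct chain_ctx *ctx, uint64_t ip, bool is_marker)
      {
              if (!is_marker && ctx->nr >= ctx->max_stack)
                      return false;

              ctx->ip[ctx->stored++] = ip;
              if (!is_marker)
                      ctx->nr++;      /* the new 'nr': addresses only */
              return true;
      }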