1. 11 4月, 2015 9 次提交
  2. 10 4月, 2015 2 次提交
    • M
      powerpc: Reword the "returning from prom_init" message · 7e862d7e
      Michael Ellerman 提交于
      We get way too many bug reports that say "the kernel is hung in
      prom_init", which stems from the fact that the last piece of output
      people see is "returning from prom_init".
      
      The kernel is almost never hung in prom_init(), it's just that it's
      crashed somewhere after prom_init() but prior to the console coming up.
      
      The existing message should give a clue to that, ie. "returning from"
      indicates that prom_init() has finished, but it doesn't seem to work.
      Let's try something different.
      
      This prints:
      
        Quiescing Open Firmware ...
        Booting Linux via __start() ...
      
      Which hopefully makes it clear that prom_init() is not the problem, and
      although __start() probably isn't either, it's at least the right place
      to begin looking.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Wistfully-Acked-by: NJeremy Kerr <jk@ozlabs.org>
      7e862d7e
    • M
      powerpc: Replace mem_init_done with slab_is_available() · f691fa10
      Michael Ellerman 提交于
      We have a powerpc specific global called mem_init_done which is "set on
      boot once kmalloc can be called".
      
      But that's not *quite* true. We set it at the bottom of mem_init(), and
      rely on the fact that mm_init() calls kmem_cache_init() immediately
      after that, and nothing is running in parallel.
      
      So replace it with the generic and 100% correct slab_is_available().
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      f691fa10
  3. 07 4月, 2015 1 次提交
  4. 31 3月, 2015 1 次提交
    • G
      powerpc/eeh: Fix PE#0 check in eeh_add_to_parent_pe() · 433185d2
      Gavin Shan 提交于
      The function eeh_add_parent_pe() is used to create a PE or add one
      edev to its parent PE. Current code checks if PE#0 is valid for the
      later case. Actually, we should validate PE#0 for both cases when
      EEH core regards PE#0 as invalid one (without flag EEH_VALID_PE_ZERO).
      Otherwise, not all EEH devices can be added to its parent PE#0 for
      EEH on P7IOC.
      
      The patch fixes the issue by validating PE#0 for the two cases. So far,
      we don't have PE#0 for EEH on P7IOC, but it will show up when we enable
      M64 for P7IOC. The patch also makes the error message more meaningful.
      Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      433185d2
  5. 28 3月, 2015 2 次提交
    • M
      powerpc: Add a proper syscall for switching endianness · 529d235a
      Michael Ellerman 提交于
      We currently have a "special" syscall for switching endianness. This is
      syscall number 0x1ebe, which is handled explicitly in the 64-bit syscall
      exception entry.
      
      That has a few problems, firstly the syscall number is outside of the
      usual range, which confuses various tools. For example strace doesn't
      recognise the syscall at all.
      
      Secondly it's handled explicitly as a special case in the syscall
      exception entry, which is complicated enough without it.
      
      As a first step toward removing the special syscall, we need to add a
      regular syscall that implements the same functionality.
      
      The logic is simple, it simply toggles the MSR_LE bit in the userspace
      MSR. This is the same as the special syscall, with the caveat that the
      special syscall clobbers fewer registers.
      
      This version clobbers r9-r12, XER, CTR, and CR0-1,5-7.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      529d235a
    • T
      powerpc/pseries: Simplify check for suspendability during suspend/migration · c03e7374
      Tyrel Datwyler 提交于
      During suspend/migration operation we must wait for the VASI state reported
      by the hypervisor to become Suspending prior to making the ibm,suspend-me
      RTAS call. Calling routines to rtas_ibm_supend_me() pass a vasi_state variable
      that exposes the VASI state to the caller. This is unnecessary as the caller
      only really cares about the following three conditions; if there is an error
      we should bailout, success indicating we have suspended and woken back up so
      proceed to device tree update, or we are not suspendable yet so try calling
      rtas_ibm_suspend_me again shortly.
      
      This patch removes the extraneous vasi_state variable and simply uses the
      return code to communicate how to proceed. We either succeed, fail, or get
      -EAGAIN in which case we sleep for a second before trying to call
      rtas_ibm_suspend_me again. The behaviour of ppc_rtas() remains the same,
      but migrate_store() now returns the propogated error code on failure.
      Previously -1 was returned from migrate_store() in the  failure case which
      equates to -EPERM and was clearly wrong.
      Signed-off-by: NTyrel Datwyler <tyreld@linux.vnet.ibm.com>
      Cc: Nathan Fontenont <nfont@linux.vnet.ibm.com>
      Cc: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      c03e7374
  6. 24 3月, 2015 6 次提交
  7. 23 3月, 2015 4 次提交
  8. 20 3月, 2015 1 次提交
  9. 17 3月, 2015 3 次提交
  10. 16 3月, 2015 1 次提交
  11. 04 3月, 2015 2 次提交
    • N
      powerpc/iommu: Remove IOMMU device references via bus notifier · 4ad04e59
      Nishanth Aravamudan 提交于
      After d905c5df ("PPC: POWERNV: move iommu_add_device earlier"), the
      refcnt on the kobject backing the IOMMU group for a PCI device is
      elevated by each call to pci_dma_dev_setup_pSeriesLP() (via
      set_iommu_table_base_and_group). When we go to dlpar a multi-function
      PCI device out:
      
              iommu_reconfig_notifier ->
                      iommu_free_table ->
                              iommu_group_put
                              BUG_ON(tbl->it_group)
      
      We trip this BUG_ON, because there are still references on the table, so
      it is not freed. Fix this by moving the powernv bus notifier to common
      code and calling it for both powernv and pseries.
      
      Fixes: d905c5df ("PPC: POWERNV: move iommu_add_device earlier")
      Signed-off-by: NNishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Tested-by: NNishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      4ad04e59
    • M
      powerpc/smp: Wait until secondaries are active & online · 875ebe94
      Michael Ellerman 提交于
      Anton has a busy ppc64le KVM box where guests sometimes hit the infamous
      "kernel BUG at kernel/smpboot.c:134!" issue during boot:
      
        BUG_ON(td->cpu != smp_processor_id());
      
      Basically a per CPU hotplug thread scheduled on the wrong CPU. The oops
      output confirms it:
      
        CPU: 0
        Comm: watchdog/130
      
      The problem is that we aren't ensuring the CPU active bit is set for the
      secondary before allowing the master to continue on. The master unparks
      the secondary CPU's kthreads and the scheduler looks for a CPU to run
      on. It calls select_task_rq() and realises the suggested CPU is not in
      the cpus_allowed mask. It then ends up in select_fallback_rq(), and
      since the active bit isnt't set we choose some other CPU to run on.
      
      This seems to have been introduced by 6acbfb96 "sched: Fix hotplug
      vs. set_cpus_allowed_ptr()", which changed from setting active before
      online to setting active after online. However that was in turn fixing a
      bug where other code assumed an active CPU was also online, so we can't
      just revert that fix.
      
      The simplest fix is just to spin waiting for both active & online to be
      set. We already have a barrier prior to set_cpu_online() (which also
      sets active), to ensure all other setup is completed before online &
      active are set.
      
      Fixes: 6acbfb96 ("sched: Fix hotplug vs. set_cpus_allowed_ptr()")
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      875ebe94
  12. 18 2月, 2015 1 次提交
  13. 14 2月, 2015 1 次提交
  14. 13 2月, 2015 2 次提交
    • C
      powerpc: add running_clock for powerpc to prevent spurious softlockup warnings · 4be1b297
      Cyril Bur 提交于
      On POWER8 virtualised kernels the VTB register can be read to have a view
      of time that only increases while the guest is running.  This will prevent
      guests from seeing time jump if a guest is paused for significant amounts
      of time.
      
      On POWER7 and below virtualised kernels stolen time is subtracted from
      local_clock as a best effort approximation.  This will not eliminate
      spurious warnings in the case of a suspended guest but may reduce the
      occurance in the case of softlockups due to host over commit.
      
      Bare metal kernels should avoid reading the VTB as KVM does not restore
      sane values when not executing, the approxmation is fine as host kernels
      won't observe any stolen time.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NCyril Bur <cyrilbur@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Andrew Jones <drjones@redhat.com>
      Acked-by: NDon Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Cc: chai wen <chaiw.fnst@cn.fujitsu.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: Aaron Tomlin <atomlin@redhat.com>
      Cc: Ben Zhang <benzh@chromium.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4be1b297
    • A
      all arches, signal: move restart_block to struct task_struct · f56141e3
      Andy Lutomirski 提交于
      If an attacker can cause a controlled kernel stack overflow, overwriting
      the restart block is a very juicy exploit target.  This is because the
      restart_block is held in the same memory allocation as the kernel stack.
      
      Moving the restart block to struct task_struct prevents this exploit by
      making the restart_block harder to locate.
      
      Note that there are other fields in thread_info that are also easy
      targets, at least on some architectures.
      
      It's also a decent simplification, since the restart code is more or less
      identical on all architectures.
      
      [james.hogan@imgtec.com: metag: align thread_info::supervisor_stack]
      Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: David Miller <davem@davemloft.net>
      Acked-by: NRichard Weinberger <richard@nod.at>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Chen Liqin <liqin.linux@gmail.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f56141e3
  15. 02 2月, 2015 3 次提交
    • G
      powerpc/kernel: Avoid initializing device-tree pointer twice · fe12545e
      Gavin Shan 提交于
      As commit 50ba08f3 ("of/fdt: Don't clear initial_boot_params
      if fdt_check_header() fails") does, the device-tree pointer
      "initial_boot_params" is initialized by early_init_dt_verify(),
      which is called by early_init_devtree(). So we needn't explicitly
      initialize that again in early_init_devtree().
      Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      fe12545e
    • M
      powerpc: Remove old compile time disabled syscall tracing code · a4bcbe6a
      Michael Ellerman 提交于
      We have code to do syscall tracing which is disabled at compile time by
      default. It's not been touched since the dawn of time (ie. v2.6.12).
      
      There are now better ways to do syscall tracing, ie. using the
      raw_syscall, or syscall tracepoints.
      
      For the specific case of tracing syscalls at boot on a system that
      doesn't get to userspace, you can boot with:
      
        trace_event=syscalls tp_printk=on
      
      Which will trace syscalls from boot, and echo all output to the console.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      a4bcbe6a
    • M
      powerpc/kernel: Make syscall_exit a local label · 4c3b2168
      Michael Ellerman 提交于
      Currently when we back trace something that is in a syscall we see
      something like this:
      
      [c000000000000000] [c000000000000000] SyS_read+0x6c/0x110
      [c000000000000000] [c000000000000000] syscall_exit+0x0/0x98
      
      Although it's entirely correct, seeing syscall_exit at the bottom can be
      confusing - we were exiting from a syscall and then called SyS_read() ?
      
      If we instead change syscall_exit to be a local label we get something
      more intuitive:
      
      [c0000001fa46fde0] [c00000000026719c] SyS_read+0x6c/0x110
      [c0000001fa46fe30] [c000000000009264] system_call+0x38/0xd0
      
      ie. we were handling a system call, and it was SyS_read().
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      4c3b2168
  16. 30 1月, 2015 1 次提交