1. 01 7月, 2013 2 次提交
    • L
      powerpc: Set cpu sibling mask before online cpu · cce606fe
      Li Zhong 提交于
      It seems following race is possible:
      
      	cpu0					cpux
      smp_init->cpu_up->_cpu_up
      	__cpu_up
      		kick_cpu(1)
      -------------------------------------------------------------------------
      		waiting online			...
      		...				notify CPU_STARTING
      							set cpux active
      						set cpux online
      -------------------------------------------------------------------------
      		finish waiting online
      		...
      sched_init_smp
      	init_sched_domains(cpu_active_mask)
      		build_sched_domains
      						set cpux sibling info
      -------------------------------------------------------------------------
      
      Execution of cpu0 and cpux could be concurrent between two separator
      lines.
      
      So if the cpux sibling information was set too late (normally
      impossible, but could be triggered by adding some delay in
      start_secondary, after setting cpu online), build_sched_domains()
      running on cpu0 might see cpux active, with an empty sibling mask, then
      cause some bad address accessing like following:
      
      [    0.099855] Unable to handle kernel paging request for data at address 0xc00000038518078f
      [    0.099868] Faulting instruction address: 0xc0000000000b7a64
      [    0.099883] Oops: Kernel access of bad area, sig: 11 [#1]
      [    0.099895] PREEMPT SMP NR_CPUS=16 DEBUG_PAGEALLOC NUMA pSeries
      [    0.099922] Modules linked in:
      [    0.099940] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-rc1-00120-gb973425c-dirty #16
      [    0.099956] task: c0000001fed80000 ti: c0000001fed7c000 task.ti: c0000001fed7c000
      [    0.099971] NIP: c0000000000b7a64 LR: c0000000000b7a40 CTR: c0000000000b4934
      [    0.099985] REGS: c0000001fed7f760 TRAP: 0300   Not tainted  (3.10.0-rc1-00120-gb973425c-dirty)
      [    0.099997] MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI>  CR: 24272828  XER: 20000003
      [    0.100045] SOFTE: 1
      [    0.100053] CFAR: c000000000445ee8
      [    0.100064] DAR: c00000038518078f, DSISR: 40000000
      [    0.100073]
      GPR00: 0000000000000080 c0000001fed7f9e0 c000000000c84d48 0000000000000010
      GPR04: 0000000000000010 0000000000000000 c0000001fc55e090 0000000000000000
      GPR08: ffffffffffffffff c000000000b80b30 c000000000c962d8 00000003845ffc5f
      GPR12: 0000000000000000 c00000000f33d000 c00000000000b9e4 0000000000000000
      GPR16: 0000000000000000 0000000000000000 0000000000000001 0000000000000000
      GPR20: c000000000ccf750 0000000000000000 c000000000c94d48 c0000001fc504000
      GPR24: c0000001fc504000 c0000001fecef848 c000000000c94d48 c000000000ccf000
      GPR28: c0000001fc522090 0000000000000010 c0000001fecef848 c0000001fed7fae0
      [    0.100293] NIP [c0000000000b7a64] .get_group+0x84/0xc4
      [    0.100307] LR [c0000000000b7a40] .get_group+0x60/0xc4
      [    0.100318] Call Trace:
      [    0.100332] [c0000001fed7f9e0] [c0000000000dbce4] .lock_is_held+0xa8/0xd0 (unreliable)
      [    0.100354] [c0000001fed7fa70] [c0000000000bf62c] .build_sched_domains+0x728/0xd14
      [    0.100375] [c0000001fed7fbe0] [c000000000af67bc] .sched_init_smp+0x4fc/0x654
      [    0.100394] [c0000001fed7fce0] [c000000000adce24] .kernel_init_freeable+0x17c/0x30c
      [    0.100413] [c0000001fed7fdb0] [c00000000000ba08] .kernel_init+0x24/0x12c
      [    0.100431] [c0000001fed7fe30] [c000000000009f74] .ret_from_kernel_thread+0x5c/0x68
      [    0.100445] Instruction dump:
      [    0.100456] 38800010 38a00000 4838e3f5 60000000 7c6307b4 2fbf0000 419e0040 3d220001
      [    0.100496] 78601f24 39491590 e93e0008 7d6a002a <7d69582a> f97f0000 7d4a002a e93e0010
      [    0.100559] ---[ end trace 31fd0ba7d8756001 ]---
      
      This patch tries to move the sibling maps updating before
      notify_cpu_starting() and cpu online, and a write barrier there to make
      sure sibling maps are updated before active and online mask.
      Signed-off-by: NLi Zhong <zhong@linux.vnet.ibm.com>
      Reviewed-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      cce606fe
    • P
      powerpc: Delete __cpuinit usage from all users · 061d19f2
      Paul Gortmaker 提交于
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      This removes all the powerpc uses of the __cpuinit macros.  There
      are no __CPUINIT users in assembly files in powerpc.
      
      [1] https://lkml.org/lkml/2013/5/20/589
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Josh Boyer <jwboyer@gmail.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Kumar Gala <galak@kernel.crashing.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      061d19f2
  2. 08 4月, 2013 1 次提交
  3. 08 2月, 2013 1 次提交
    • D
      powerpc: fix ics_rtas_init and start_secondary section mismatch · 174ea471
      Daniel Borkmann 提交于
      It seems, we're fine with just annotating the two functions.
      Thus, this fixes the following build warnings on ppc64:
      
      WARNING: arch/powerpc/sysdev/xics/built-in.o(.text+0x1664):
      The function .ics_rtas_init() references
      the function __init .xics_register_ics().
      This is often because .ics_rtas_init lacks a __init
      annotation or the annotation of .xics_register_ics is wrong.
      
      WARNING: arch/powerpc/sysdev/built-in.o(.text+0x6044):
      The function .ics_rtas_init() references
      the function __init .xics_register_ics().
      This is often because .ics_rtas_init lacks a __init
      annotation or the annotation of .xics_register_ics is wrong.
      
      WARNING: arch/powerpc/kernel/built-in.o(.text+0x2db30):
      The function .start_secondary() references
      the function __cpuinit .vdso_getcpu_init().
      This is often because .start_secondary lacks a __cpuinit
      annotation or the annotation of .vdso_getcpu_init is wrong.
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      174ea471
  4. 04 1月, 2013 1 次提交
    • G
      POWERPC: drivers: remove __dev* attributes. · cad5cef6
      Greg Kroah-Hartman 提交于
      CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
      markings need to be removed.
      
      This change removes the use of __devinit, __devexit_p, __devinitdata,
      __devinitconst, and __devexit from these drivers.
      
      Based on patches originally written by Bill Pemberton, but redone by me
      in order to handle some of the coding style issues better, by hand.
      
      Cc: Bill Pemberton <wfp5p@virginia.edu>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cad5cef6
  5. 30 10月, 2012 1 次提交
    • P
      KVM: PPC: Book3S HV: Allow KVM guests to stop secondary threads coming online · 512691d4
      Paul Mackerras 提交于
      When a Book3S HV KVM guest is running, we need the host to be in
      single-thread mode, that is, all of the cores (or at least all of
      the cores where the KVM guest could run) to be running only one
      active hardware thread.  This is because of the hardware restriction
      in POWER processors that all of the hardware threads in the core
      must be in the same logical partition.  Complying with this restriction
      is much easier if, from the host kernel's point of view, only one
      hardware thread is active.
      
      This adds two hooks in the SMP hotplug code to allow the KVM code to
      make sure that secondary threads (i.e. hardware threads other than
      thread 0) cannot come online while any KVM guest exists.  The KVM
      code still has to check that any core where it runs a guest has the
      secondary threads offline, but having done that check it can now be
      sure that they will not come online while the guest is running.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      512691d4
  6. 19 9月, 2012 1 次提交
  7. 13 9月, 2012 1 次提交
  8. 05 9月, 2012 1 次提交
    • P
      powerpc: Make sure IPI handlers see data written by IPI senders · 9fb1b36c
      Paul Mackerras 提交于
      We have been observing hangs, both of KVM guest vcpu tasks and more
      generally, where a process that is woken doesn't properly wake up and
      continue to run, but instead sticks in TASK_WAKING state.  This
      happens because the update of rq->wake_list in ttwu_queue_remote()
      is not ordered with the update of ipi_message in
      smp_muxed_ipi_message_pass(), and the reading of rq->wake_list in
      scheduler_ipi() is not ordered with the reading of ipi_message in
      smp_ipi_demux().  Thus it is possible for the IPI receiver not to see
      the updated rq->wake_list and therefore conclude that there is nothing
      for it to do.
      
      In order to make sure that anything done before smp_send_reschedule()
      is ordered before anything done in the resulting call to scheduler_ipi(),
      this adds barriers in smp_muxed_message_pass() and smp_ipi_demux().
      The barrier in smp_muxed_message_pass() is a full barrier to ensure that
      there is a full ordering between the smp_send_reschedule() caller and
      scheduler_ipi().  In smp_ipi_demux(), we use xchg() rather than
      xchg_local() because xchg() includes release and acquire barriers.
      Using xchg() rather than xchg_local() makes sense given that
      ipi_message is not just accessed locally.
      
      This moves the barrier between setting the message and calling the
      cause_ipi() function into the individual cause_ipi implementations.
      Most of them -- those that used outb, out_8 or similar -- already had
      a full barrier because out_8 etc. include a sync before the MMIO
      store.  This adds an explicit barrier in the two remaining cases.
      
      These changes made no measurable difference to the speed of IPIs as
      measured using a simple ping-pong latency test across two CPUs on
      different cores of a POWER7 machine.
      
      The analysis of the reason why processes were not waking up properly
      is due to Milton Miller.
      
      Cc: stable@vger.kernel.org # v3.0+
      Reported-by: NMilton Miller <miltonm@bga.com>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      9fb1b36c
  9. 11 7月, 2012 1 次提交
    • A
      powerpc: Add VDSO version of getcpu · 18ad51dd
      Anton Blanchard 提交于
      We have a request for a fast method of getting CPU and NUMA node IDs
      from userspace. This patch implements a getcpu VDSO function,
      similar to x86.
      
      Ben suggested we use SPRG3 which is userspace readable. SPRG3 can be
      modified by a KVM guest, so we save the SPRG3 value in the paca and
      restore it when transitioning from the guest to the host.
      
      I have a glibc patch that implements sched_getcpu on top of this.
      Testing on a POWER7:
      
      baseline: 538 cycles
      vdso:      30 cycles
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      18ad51dd
  10. 03 7月, 2012 1 次提交
    • Y
      powerpc/smp: remove call to ipi_call_lock()/ipi_call_unlock() · e250d4bc
      Yong Zhang 提交于
      1) call_function.lock used in smp_call_function_many() is just to protect
         call_function.queue and &data->refs, cpu_online_mask is outside of the
         lock. And it's not necessary to protect cpu_online_mask,
         because data->cpumask is pre-calculate and even if a cpu is brougt up
         when calling arch_send_call_function_ipi_mask(), it's harmless because
         validation test in generic_smp_call_function_interrupt() will take care
         of it.
      
      2) For cpu down issue, stop_machine() will guarantee that no concurrent
         smp_call_fuction() is processing.
      Signed-off-by: NYong Zhang <yong.zhang0@gmail.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      e250d4bc
  11. 05 6月, 2012 1 次提交
  12. 26 4月, 2012 2 次提交
    • T
      powerpc: Use generic idle thread allocation · 17e32eac
      Thomas Gleixner 提交于
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Link: http://lkml.kernel.org/r/20120420124557.311212868@linutronix.de
      17e32eac
    • T
      smp: Add task_struct argument to __cpu_up() · 8239c25f
      Thomas Gleixner 提交于
      Preparatory patch to make the idle thread allocation for secondary
      cpus generic.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: James E.J. Bottomley <jejb@parisc-linux.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: x86@kernel.org
      Link: http://lkml.kernel.org/r/20120420124556.964170564@linutronix.de
      8239c25f
  13. 29 3月, 2012 1 次提交
  14. 22 12月, 2011 1 次提交
    • K
      cpu: convert 'cpu' and 'machinecheck' sysdev_class to a regular subsystem · 8a25a2fd
      Kay Sievers 提交于
      This moves the 'cpu sysdev_class' over to a regular 'cpu' subsystem
      and converts the devices to regular devices. The sysdev drivers are
      implemented as subsystem interfaces now.
      
      After all sysdev classes are ported to regular driver core entities, the
      sysdev implementation will be entirely removed from the kernel.
      
      Userspace relies on events and generic sysfs subsystem infrastructure
      from sysdev devices, which are made available with this conversion.
      
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Borislav Petkov <bp@amd64.org>
      Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Zhang Rui <rui.zhang@intel.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: NKay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      8a25a2fd
  15. 25 11月, 2011 1 次提交
  16. 08 11月, 2011 1 次提交
  17. 01 11月, 2011 1 次提交
  18. 20 9月, 2011 1 次提交
  19. 27 7月, 2011 1 次提交
  20. 12 7月, 2011 1 次提交
    • P
      KVM: PPC: Add support for Book3S processors in hypervisor mode · de56a948
      Paul Mackerras 提交于
      This adds support for KVM running on 64-bit Book 3S processors,
      specifically POWER7, in hypervisor mode.  Using hypervisor mode means
      that the guest can use the processor's supervisor mode.  That means
      that the guest can execute privileged instructions and access privileged
      registers itself without trapping to the host.  This gives excellent
      performance, but does mean that KVM cannot emulate a processor
      architecture other than the one that the hardware implements.
      
      This code assumes that the guest is running paravirtualized using the
      PAPR (Power Architecture Platform Requirements) interface, which is the
      interface that IBM's PowerVM hypervisor uses.  That means that existing
      Linux distributions that run on IBM pSeries machines will also run
      under KVM without modification.  In order to communicate the PAPR
      hypercalls to qemu, this adds a new KVM_EXIT_PAPR_HCALL exit code
      to include/linux/kvm.h.
      
      Currently the choice between book3s_hv support and book3s_pr support
      (i.e. the existing code, which runs the guest in user mode) has to be
      made at kernel configuration time, so a given kernel binary can only
      do one or the other.
      
      This new book3s_hv code doesn't support MMIO emulation at present.
      Since we are running paravirtualized guests, this isn't a serious
      restriction.
      
      With the guest running in supervisor mode, most exceptions go straight
      to the guest.  We will never get data or instruction storage or segment
      interrupts, alignment interrupts, decrementer interrupts, program
      interrupts, single-step interrupts, etc., coming to the hypervisor from
      the guest.  Therefore this introduces a new KVMTEST_NONHV macro for the
      exception entry path so that we don't have to do the KVM test on entry
      to those exception handlers.
      
      We do however get hypervisor decrementer, hypervisor data storage,
      hypervisor instruction storage, and hypervisor emulation assist
      interrupts, so we have to handle those.
      
      In hypervisor mode, real-mode accesses can access all of RAM, not just
      a limited amount.  Therefore we put all the guest state in the vcpu.arch
      and use the shadow_vcpu in the PACA only for temporary scratch space.
      We allocate the vcpu with kzalloc rather than vzalloc, and we don't use
      anything in the kvmppc_vcpu_book3s struct, so we don't allocate it.
      We don't have a shared page with the guest, but we still need a
      kvm_vcpu_arch_shared struct to store the values of various registers,
      so we include one in the vcpu_arch struct.
      
      The POWER7 processor has a restriction that all threads in a core have
      to be in the same partition.  MMU-on kernel code counts as a partition
      (partition 0), so we have to do a partition switch on every entry to and
      exit from the guest.  At present we require the host and guest to run
      in single-thread mode because of this hardware restriction.
      
      This code allocates a hashed page table for the guest and initializes
      it with HPTEs for the guest's Virtual Real Memory Area (VRMA).  We
      require that the guest memory is allocated using 16MB huge pages, in
      order to simplify the low-level memory management.  This also means that
      we can get away without tracking paging activity in the host for now,
      since huge pages can't be paged or swapped.
      
      This also adds a few new exports needed by the book3s_hv code.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      de56a948
  21. 08 7月, 2011 1 次提交
  22. 29 6月, 2011 1 次提交
  23. 20 6月, 2011 1 次提交
  24. 26 5月, 2011 1 次提交
    • M
      powerpc/cell: Use common smp ipi actions · 7ef71d75
      Milton Miller 提交于
      The cell iic interrupt controller has enough software caused interrupts
      to use a unique interrupt for each of the 4 messages powerpc uses.
      This means each interrupt gets its own irq action/data combination.
      
      Use the seperate, optimized, arch common ipi action functions
      registered via the helper smp_request_message_ipi instead passing the
      message as action data to a single action that then demultipexes to
      the required acton via a switch statement.
      
      smp_request_message_ipi will register the action as IRQF_PER_CPU
      and IRQF_DISABLED, and WARN if the allocation fails for some reason,
      so no need to print on that failure.  It will return positive if
      the message will not be used by the kernel, in which case we can
      free the virq.
      
      In addition to elimiating inefficient code, this also corrects the
      error that a kernel built with kexec but without a debugger would
      not register the ipi for kdump to notify the other cpus of a crash.
      
      This also restores the debugger action to be static to kernel/smp.c.
      Signed-off-by: NMilton Miller <miltonm@bga.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      7ef71d75
  25. 19 5月, 2011 5 次提交
    • M
      powerpc: Use bytes instead of bitops in smp ipi multiplexing · 71454272
      Milton Miller 提交于
      Since there are only 4 messages, we can replace the atomic bit set
      (which uses atomic load reserve and store conditional sequence) with
      a byte stores to seperate bytes.  We still have to perform a load
      reserve and store conditional sequence to avoid loosing messages on
      reception but we can do that with a single call to xchg.
      
      The do {} while and __BIG_ENDIAN specific mask testing was chosen by
      looking at the generated asm code.  On gcc-4.4, the bit masking becomes
      a simple bit mask and test of the register returned from xchg without
      storing and loading the value to the stack like attempts with a union
      of bytes and an int (or worse, loading single bit constants from the
      constant pool into non-voliatle registers that had to be preseved on
      the stack).  The do {} while avoids an unconditional branch to the
      end of the loop to test the entry / repeat condition of a while loop
      and instead optimises for the expected single iteration of the loop.
      
      We have a full mb() at the beginning to cover ordering between send,
      ipi, and receive so we can use xchg_local and forgo the further
      acquire and release barriers of xchg.
      Signed-off-by: NMilton Miller <miltonm@bga.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      71454272
    • M
      powerpc: Add kconfig for muxed smp ipi support · 1ece355b
      Milton Miller 提交于
      Compile the new smp ipi mux and demux code only if a platform
      will make use of it.  The new config is selected as required.
      
      The new cause_ipi smp op is only available conditionally to point out
      configs where the select is required; this makes setting the op an
      immediate fail instead of a deferred unresolved symbol at link.
      
      This also creates a new config for power surge powermac upgrade support
      that can be disabled in expert mode but is default on.
      
      I also removed the depends / default y on CONFIG_XICS since it is selected
      by PSERIES.
      Signed-off-by: NMilton Miller <miltonm@bga.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      1ece355b
    • M
      powerpc: Consolidate ipi message mux and demux · 23d72bfd
      Milton Miller 提交于
      Consolidate the mux and demux of ipi messages into smp.c and call
      a new smp_ops callback to actually trigger the ipi.
      
      The powerpc architecture code is optimised for having 4 distinct
      ipi triggers, which are mapped to 4 distinct messages (ipi many, ipi
      single, scheduler ipi, and enter debugger).  However, several interrupt
      controllers only provide a single software triggered interrupt that
      can be delivered to each cpu.  To resolve this limitation, each smp_ops
      implementation created a per-cpu variable that is manipulated with atomic
      bitops.  Since these lines will be contended they are optimialy marked as
      shared_aligned and take a full cache line for each cpu.  Distro kernels
      may have 2 or 3 of these in their config, each taking per-cpu space
      even though at most one will be in use.
      
      This consolidation removes smp_message_recv and replaces the single call
      actions cases with direct calls from the common message recognition loop.
      The complicated debugger ipi case with its muxed crash handling code is
      moved to debug_ipi_action which is now called from the demux code (instead
      of the multi-message action calling smp_message_recv).
      
      I put a call to reschedule_action to increase the likelyhood of correctly
      merging the anticipated scheduler_ipi() hook coming from the scheduler
      tree; that single required call can be inlined later.
      
      The actual message decode is a copy of the old pseries xics code with its
      memory barriers and cache line spacing, augmented with a per-cpu unsigned
      long based on the book-e doorbell code.  The optional data is set via a
      callback from the implementation and is passed to the new cause-ipi hook
      along with the logical cpu number.  While currently only the doorbell
      implemntation uses this data it should be almost zero cost to retrieve and
      pass it -- it adds a single register load for the argument from the same
      cache line to which we just completed a store and the register is dead
      on return from the call.  I extended the data element from unsigned int
      to unsigned long in case some other code wanted to associate a pointer.
      
      The doorbell check_self is replaced by a call to smp_muxed_ipi_resend,
      conditioned on the CPU_DBELL feature.  The ifdef guard could be relaxed
      to CONFIG_SMP but I left it with BOOKE for now.
      
      Also, the doorbell interrupt vector for book-e was not calling irq_enter
      and irq_exit, which throws off cpu accounting and causes code to not
      realize it is running in interrupt context.  Add the missing calls.
      Signed-off-by: NMilton Miller <miltonm@bga.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      23d72bfd
    • M
      powerpc: Remove call sites of MSG_ALL_BUT_SELF · e0476371
      Milton Miller 提交于
      The only user of MSG_ALL_BUT_SELF in the whole kernel tree is powerpc,
      and it only uses it to start the debugger. Both debuggers always call
      smp_send_debugger_break with MSG_ALL_BUT_SELF, and only mpic can do
      anything more optimal than a loop over all online cpus, but all message
      passing implementations have to code for this special delivery target.
      
      Convert smp_send_debugger_break to take void and loop calling the smp_ops
      message_pass function for each of the other cpus in the online cpumask.
      
      Use raw_smp_processor_id() because we are either entering the debugger
      or trying to start kdump and the additional warning it not useful were
      it to trigger.
      Signed-off-by: NMilton Miller <miltonm@bga.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      e0476371
    • K
      powerpc/4xx: Fix regression in SMP on 476 · c560bbce
      kerstin jonsson 提交于
      commit c56e5853 breaks SMP support in PPC_47x chip.
       secondary_ti must be set to current thread info before callin kick_cpu or else
       start_secondary_47x will jump into void when trying to return to c-code.
       In the current setup secondary_ti is initialized before the CPU idle task is started
       and only the boot core will start. I am not sure this is the correct solution, but it
       makes SMP possible in my chip.
       Note! The HOTPLUG support probably need some fixing to, There is no trampoline code
       available in head_44x.S - start_secondary_resume?
      Signed-off-by: NKerstin Jonsson <kerstin.jonsson@ericsson.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      c560bbce
  26. 04 5月, 2011 1 次提交
  27. 20 4月, 2011 1 次提交
  28. 14 4月, 2011 1 次提交
  29. 01 4月, 2011 6 次提交