提交 · cce606feb425093c8371089d392e336d186e125b · openeuler / raspberrypi-kernel

01 7月, 2013 2 次提交

powerpc: Set cpu sibling mask before online cpu · cce606fe

由 Li Zhong 提交于 5月 16, 2013

It seems following race is possible:

	cpu0					cpux
smp_init->cpu_up->_cpu_up
	__cpu_up
		kick_cpu(1)
-------------------------------------------------------------------------
		waiting online			...
		...				notify CPU_STARTING
							set cpux active
						set cpux online
-------------------------------------------------------------------------
		finish waiting online
		...
sched_init_smp
	init_sched_domains(cpu_active_mask)
		build_sched_domains
						set cpux sibling info
-------------------------------------------------------------------------

Execution of cpu0 and cpux could be concurrent between two separator
lines.

So if the cpux sibling information was set too late (normally
impossible, but could be triggered by adding some delay in
start_secondary, after setting cpu online), build_sched_domains()
running on cpu0 might see cpux active, with an empty sibling mask, then
cause some bad address accessing like following:

[    0.099855] Unable to handle kernel paging request for data at address 0xc00000038518078f
[    0.099868] Faulting instruction address: 0xc0000000000b7a64
[    0.099883] Oops: Kernel access of bad area, sig: 11 [#1]
[    0.099895] PREEMPT SMP NR_CPUS=16 DEBUG_PAGEALLOC NUMA pSeries
[    0.099922] Modules linked in:
[    0.099940] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-rc1-00120-gb973425c-dirty #16
[    0.099956] task: c0000001fed80000 ti: c0000001fed7c000 task.ti: c0000001fed7c000
[    0.099971] NIP: c0000000000b7a64 LR: c0000000000b7a40 CTR: c0000000000b4934
[    0.099985] REGS: c0000001fed7f760 TRAP: 0300   Not tainted  (3.10.0-rc1-00120-gb973425c-dirty)
[    0.099997] MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI>  CR: 24272828  XER: 20000003
[    0.100045] SOFTE: 1
[    0.100053] CFAR: c000000000445ee8
[    0.100064] DAR: c00000038518078f, DSISR: 40000000
[    0.100073]
GPR00: 0000000000000080 c0000001fed7f9e0 c000000000c84d48 0000000000000010
GPR04: 0000000000000010 0000000000000000 c0000001fc55e090 0000000000000000
GPR08: ffffffffffffffff c000000000b80b30 c000000000c962d8 00000003845ffc5f
GPR12: 0000000000000000 c00000000f33d000 c00000000000b9e4 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000001 0000000000000000
GPR20: c000000000ccf750 0000000000000000 c000000000c94d48 c0000001fc504000
GPR24: c0000001fc504000 c0000001fecef848 c000000000c94d48 c000000000ccf000
GPR28: c0000001fc522090 0000000000000010 c0000001fecef848 c0000001fed7fae0
[    0.100293] NIP [c0000000000b7a64] .get_group+0x84/0xc4
[    0.100307] LR [c0000000000b7a40] .get_group+0x60/0xc4
[    0.100318] Call Trace:
[    0.100332] [c0000001fed7f9e0] [c0000000000dbce4] .lock_is_held+0xa8/0xd0 (unreliable)
[    0.100354] [c0000001fed7fa70] [c0000000000bf62c] .build_sched_domains+0x728/0xd14
[    0.100375] [c0000001fed7fbe0] [c000000000af67bc] .sched_init_smp+0x4fc/0x654
[    0.100394] [c0000001fed7fce0] [c000000000adce24] .kernel_init_freeable+0x17c/0x30c
[    0.100413] [c0000001fed7fdb0] [c00000000000ba08] .kernel_init+0x24/0x12c
[    0.100431] [c0000001fed7fe30] [c000000000009f74] .ret_from_kernel_thread+0x5c/0x68
[    0.100445] Instruction dump:
[    0.100456] 38800010 38a00000 4838e3f5 60000000 7c6307b4 2fbf0000 419e0040 3d220001
[    0.100496] 78601f24 39491590 e93e0008 7d6a002a <7d69582a> f97f0000 7d4a002a e93e0010
[    0.100559] ---[ end trace 31fd0ba7d8756001 ]---

This patch tries to move the sibling maps updating before
notify_cpu_starting() and cpu online, and a write barrier there to make
sure sibling maps are updated before active and online mask.
Signed-off-by: NLi Zhong <zhong@linux.vnet.ibm.com>
Reviewed-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

cce606fe

powerpc: Delete __cpuinit usage from all users · 061d19f2

由 Paul Gortmaker 提交于 6月 24, 2013

The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications.  For example, the fix in
commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.

After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out.  Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.

This removes all the powerpc uses of the __cpuinit macros.  There
are no __CPUINIT users in assembly files in powerpc.

[1] https://lkml.org/lkml/2013/5/20/589

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

061d19f2

08 4月, 2013 1 次提交

powerpc: Use generic idle loop · 799fef06

由 Thomas Gleixner 提交于 3月 21, 2013

Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: NCc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Magnus Damm <magnus.damm@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Link: http://lkml.kernel.org/r/20130321215235.026838003@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

799fef06

08 2月, 2013 1 次提交

powerpc: fix ics_rtas_init and start_secondary section mismatch · 174ea471

由 Daniel Borkmann 提交于 2月 05, 2013

It seems, we're fine with just annotating the two functions.
Thus, this fixes the following build warnings on ppc64:

WARNING: arch/powerpc/sysdev/xics/built-in.o(.text+0x1664):
The function .ics_rtas_init() references
the function __init .xics_register_ics().
This is often because .ics_rtas_init lacks a __init
annotation or the annotation of .xics_register_ics is wrong.

WARNING: arch/powerpc/sysdev/built-in.o(.text+0x6044):
The function .ics_rtas_init() references
the function __init .xics_register_ics().
This is often because .ics_rtas_init lacks a __init
annotation or the annotation of .xics_register_ics is wrong.

WARNING: arch/powerpc/kernel/built-in.o(.text+0x2db30):
The function .start_secondary() references
the function __cpuinit .vdso_getcpu_init().
This is often because .start_secondary lacks a __cpuinit
annotation or the annotation of .vdso_getcpu_init is wrong.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

174ea471

04 1月, 2013 1 次提交

POWERPC: drivers: remove __dev* attributes. · cad5cef6

由 Greg Kroah-Hartman 提交于 12月 21, 2012

CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
markings need to be removed.

This change removes the use of __devinit, __devexit_p, __devinitdata,
__devinitconst, and __devexit from these drivers.

Based on patches originally written by Bill Pemberton, but redone by me
in order to handle some of the coding style issues better, by hand.

Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

cad5cef6

30 10月, 2012 1 次提交

KVM: PPC: Book3S HV: Allow KVM guests to stop secondary threads coming online · 512691d4

由 Paul Mackerras 提交于 10月 15, 2012

When a Book3S HV KVM guest is running, we need the host to be in
single-thread mode, that is, all of the cores (or at least all of
the cores where the KVM guest could run) to be running only one
active hardware thread. This is because of the hardware restriction
in POWER processors that all of the hardware threads in the core
must be in the same logical partition. Complying with this restriction
is much easier if, from the host kernel's point of view, only one
hardware thread is active.

This adds two hooks in the SMP hotplug code to allow the KVM code to
make sure that secondary threads (i.e. hardware threads other than
thread 0) cannot come online while any KVM guest exists. The KVM
code still has to check that any core where it runs a guest has the
secondary threads offline, but having done that check it can now be
sure that they will not come online while the guest is running.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

512691d4

19 9月, 2012 1 次提交

powerpc/smp: Do not disable IPI interrupts during suspend · e6651de9

由 Zhao Chenhui 提交于 7月 20, 2012

During suspend, all interrupts including IPI will be disabled. In this case,
the suspend process will hang in SMP. To prevent this, pass the flag
IRQF_NO_SUSPEND when requesting IPI irq.
Signed-off-by: NZhao Chenhui <chenhui.zhao@freescale.com>
Signed-off-by: NLi Yang <leoli@freescale.com>
Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>

e6651de9

13 9月, 2012 1 次提交

powerpc/smp: add generic_set_cpu_up() to set cpu_state as CPU_UP_PREPARE · ae5cab47

由 Zhao Chenhui 提交于 7月 20, 2012

In the case of cpu hotplug, the cpu_state should be set to CPU_UP_PREPARE
when kicking cpu.  Otherwise, the cpu_state is always CPU_DEAD after
calling generic_set_cpu_dead(), which makes the delay in generic_cpu_die()
not happen.
Signed-off-by: NZhao Chenhui <chenhui.zhao@freescale.com>
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>

ae5cab47

05 9月, 2012 1 次提交

powerpc: Make sure IPI handlers see data written by IPI senders · 9fb1b36c

由 Paul Mackerras 提交于 9月 04, 2012

We have been observing hangs, both of KVM guest vcpu tasks and more
generally, where a process that is woken doesn't properly wake up and
continue to run, but instead sticks in TASK_WAKING state.  This
happens because the update of rq->wake_list in ttwu_queue_remote()
is not ordered with the update of ipi_message in
smp_muxed_ipi_message_pass(), and the reading of rq->wake_list in
scheduler_ipi() is not ordered with the reading of ipi_message in
smp_ipi_demux().  Thus it is possible for the IPI receiver not to see
the updated rq->wake_list and therefore conclude that there is nothing
for it to do.

In order to make sure that anything done before smp_send_reschedule()
is ordered before anything done in the resulting call to scheduler_ipi(),
this adds barriers in smp_muxed_message_pass() and smp_ipi_demux().
The barrier in smp_muxed_message_pass() is a full barrier to ensure that
there is a full ordering between the smp_send_reschedule() caller and
scheduler_ipi().  In smp_ipi_demux(), we use xchg() rather than
xchg_local() because xchg() includes release and acquire barriers.
Using xchg() rather than xchg_local() makes sense given that
ipi_message is not just accessed locally.

This moves the barrier between setting the message and calling the
cause_ipi() function into the individual cause_ipi implementations.
Most of them -- those that used outb, out_8 or similar -- already had
a full barrier because out_8 etc. include a sync before the MMIO
store.  This adds an explicit barrier in the two remaining cases.

These changes made no measurable difference to the speed of IPIs as
measured using a simple ping-pong latency test across two CPUs on
different cores of a POWER7 machine.

The analysis of the reason why processes were not waking up properly
is due to Milton Miller.

Cc: stable@vger.kernel.org # v3.0+
Reported-by: NMilton Miller <miltonm@bga.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

9fb1b36c

11 7月, 2012 1 次提交

powerpc: Add VDSO version of getcpu · 18ad51dd

由 Anton Blanchard 提交于 7月 04, 2012

We have a request for a fast method of getting CPU and NUMA node IDs
from userspace. This patch implements a getcpu VDSO function,
similar to x86.

Ben suggested we use SPRG3 which is userspace readable. SPRG3 can be
modified by a KVM guest, so we save the SPRG3 value in the paca and
restore it when transitioning from the guest to the host.

I have a glibc patch that implements sched_getcpu on top of this.
Testing on a POWER7:

baseline: 538 cycles
vdso:      30 cycles
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

18ad51dd

03 7月, 2012 1 次提交

powerpc/smp: remove call to ipi_call_lock()/ipi_call_unlock() · e250d4bc

由 Yong Zhang 提交于 5月 28, 2012

1) call_function.lock used in smp_call_function_many() is just to protect
   call_function.queue and &data->refs, cpu_online_mask is outside of the
   lock. And it's not necessary to protect cpu_online_mask,
   because data->cpumask is pre-calculate and even if a cpu is brougt up
   when calling arch_send_call_function_ipi_mask(), it's harmless because
   validation test in generic_smp_call_function_interrupt() will take care
   of it.

2) For cpu down issue, stop_machine() will guarantee that no concurrent
   smp_call_fuction() is processing.
Signed-off-by: NYong Zhang <yong.zhang0@gmail.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

e250d4bc

05 6月, 2012 1 次提交

POWERPC: Smp: remove call to ipi_call_lock()/ipi_call_unlock() · 72ec61a9

由 Yong Zhang 提交于 5月 29, 2012

ipi_call_lock/unlock() lock resp. unlock call_function.lock. This lock
protects only the call_function data structure itself, but it's
completely unrelated to cpu_online_mask. The mask to which the IPIs
are sent is calculated before call_function.lock is taken in
smp_call_function_many(), so the locking around set_cpu_online() is
pointless and can be removed.

[ tglx: Massaged changelog ]
Signed-off-by: NYong Zhang <yong.zhang0@gmail.com>
Cc: ralf@linux-mips.org
Cc: sshtylyov@mvista.com
Cc: david.daney@cavium.com
Cc: nikunj@linux.vnet.ibm.com
Cc: paulmck@linux.vnet.ibm.com
Cc: axboe@kernel.dk
Cc: peterz@infradead.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1338275765-3217-10-git-send-email-yong.zhang0@gmail.comAcked-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: NPeter Zijlstra <peterz@infradead.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

72ec61a9

26 4月, 2012 2 次提交

powerpc: Use generic idle thread allocation · 17e32eac

由 Thomas Gleixner 提交于 4月 20, 2012

Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Link: http://lkml.kernel.org/r/20120420124557.311212868@linutronix.de

17e32eac

smp: Add task_struct argument to __cpu_up() · 8239c25f

由 Thomas Gleixner 提交于 4月 20, 2012

Preparatory patch to make the idle thread allocation for secondary
cpus generic.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/20120420124556.964170564@linutronix.de

8239c25f

29 3月, 2012 1 次提交

Disintegrate asm/system.h for PowerPC · ae3a197e

由 David Howells 提交于 3月 28, 2012

Disintegrate asm/system.h for PowerPC.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
cc: linuxppc-dev@lists.ozlabs.org

ae3a197e

22 12月, 2011 1 次提交

cpu: convert 'cpu' and 'machinecheck' sysdev_class to a regular subsystem · 8a25a2fd

由 Kay Sievers 提交于 12月 21, 2011

This moves the 'cpu sysdev_class' over to a regular 'cpu' subsystem
and converts the devices to regular devices. The sysdev drivers are
implemented as subsystem interfaces now.

After all sysdev classes are ported to regular driver core entities, the
sysdev implementation will be entirely removed from the kernel.

Userspace relies on events and generic sysfs subsystem infrastructure
from sysdev devices, which are made available with this conversion.

Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@amd64.org>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Cc: Len Brown <lenb@kernel.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: NKay Sievers <kay.sievers@vrfy.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>

8a25a2fd

25 11月, 2011 1 次提交

powerpc: Mark IPI interrupts IRQF_NO_THREAD · 3b5e16d7

由 Thomas Gleixner 提交于 10月 05, 2011

IPI handlers cannot be threaded. Remove the obsolete IRQF_DISABLED
flag (see commit e58aa3d2) while at it.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

3b5e16d7

08 11月, 2011 1 次提交

powerpc/irq: Remove IRQF_DISABLED · a3a9f3b4

由 Yong Zhang 提交于 10月 21, 2011

Since commit [e58aa3d2: genirq: Run irq handlers with interrupts disabled],
We run all interrupt handlers with interrupts disabled
and we even check and yell when an interrupt handler
returns with interrupts enabled (see commit [b738a50a:
genirq: Warn when handler enables interrupts]).

So now this flag is a NOOP and can be removed.
Signed-off-by: NYong Zhang <yong.zhang0@gmail.com>
Acked-by: NArnd Bergmann <arnd@arndb.de>
Acked-by: NGeoff Levand <geoff@infradead.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

a3a9f3b4

01 11月, 2011 1 次提交

powerpc: various straight conversions from module.h --> export.h · 4b16f8e2

由 Paul Gortmaker 提交于 7月 22, 2011

All these files were including module.h just for the basic
EXPORT_SYMBOL infrastructure.  We can shift them off to the
export.h header which is a way smaller footprint and thus
realize some compile time gains.
Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>

4b16f8e2

20 9月, 2011 1 次提交

powerpc/smp: More generic support for "soft hotplug" · fb82b839

由 Benjamin Herrenschmidt 提交于 9月 19, 2011

This adds more generic support for doing CPU hotplug with a simple
idle loop and no actual reset of the processors. The generic
smp_generic_kick_cpu() does the hotplug bringup trick if the PACA
shows that the CPU has already been started at boot and we provide
an accessor for the CPU state.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

fb82b839

27 7月, 2011 1 次提交

atomic: use <linux/atomic.h> · 60063497

由 Arun Sharma 提交于 7月 26, 2011

This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>
Signed-off-by: NArun Sharma <asharma@fb.com>
Reviewed-by: NEric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: NMike Frysinger <vapier@gentoo.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

60063497

12 7月, 2011 1 次提交

KVM: PPC: Add support for Book3S processors in hypervisor mode · de56a948

由 Paul Mackerras 提交于 6月 29, 2011

This adds support for KVM running on 64-bit Book 3S processors,
specifically POWER7, in hypervisor mode. Using hypervisor mode means
that the guest can use the processor's supervisor mode. That means
that the guest can execute privileged instructions and access privileged
registers itself without trapping to the host. This gives excellent
performance, but does mean that KVM cannot emulate a processor
architecture other than the one that the hardware implements.

This code assumes that the guest is running paravirtualized using the
PAPR (Power Architecture Platform Requirements) interface, which is the
interface that IBM's PowerVM hypervisor uses. That means that existing
Linux distributions that run on IBM pSeries machines will also run
under KVM without modification. In order to communicate the PAPR
hypercalls to qemu, this adds a new KVM_EXIT_PAPR_HCALL exit code
to include/linux/kvm.h.

Currently the choice between book3s_hv support and book3s_pr support
(i.e. the existing code, which runs the guest in user mode) has to be
made at kernel configuration time, so a given kernel binary can only
do one or the other.

This new book3s_hv code doesn't support MMIO emulation at present.
Since we are running paravirtualized guests, this isn't a serious
restriction.

With the guest running in supervisor mode, most exceptions go straight
to the guest. We will never get data or instruction storage or segment
interrupts, alignment interrupts, decrementer interrupts, program
interrupts, single-step interrupts, etc., coming to the hypervisor from
the guest. Therefore this introduces a new KVMTEST_NONHV macro for the
exception entry path so that we don't have to do the KVM test on entry
to those exception handlers.

We do however get hypervisor decrementer, hypervisor data storage,
hypervisor instruction storage, and hypervisor emulation assist
interrupts, so we have to handle those.

In hypervisor mode, real-mode accesses can access all of RAM, not just
a limited amount. Therefore we put all the guest state in the vcpu.arch
and use the shadow_vcpu in the PACA only for temporary scratch space.
We allocate the vcpu with kzalloc rather than vzalloc, and we don't use
anything in the kvmppc_vcpu_book3s struct, so we don't allocate it.
We don't have a shared page with the guest, but we still need a
kvm_vcpu_arch_shared struct to store the values of various registers,
so we include one in the vcpu_arch struct.

The POWER7 processor has a restriction that all threads in a core have
to be in the same partition. MMU-on kernel code counts as a partition
(partition 0), so we have to do a partition switch on every entry to and
exit from the guest. At present we require the host and guest to run
in single-thread mode because of this hardware restriction.

This code allocates a hashed page table for the guest and initializes
it with HPTEs for the guest's Virtual Real Memory Area (VRMA). We
require that the guest memory is allocated using 16MB huge pages, in
order to simplify the low-level memory management. This also means that
we can get away without tracking paging activity in the host for now,
since huge pages can't be paged or swapped.

This also adds a few new exports needed by the book3s_hv code.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

de56a948

08 7月, 2011 1 次提交

powerpc: Create next_tlbcam_idx percpu variable for FSL_BOOKE · 3160b097

由 Becky Bruce 提交于 6月 28, 2011

This is used to round-robin TLBCAM entries.
Signed-off-by: NBecky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>

3160b097

29 6月, 2011 1 次提交

powerpc/book3e-64: Reraise doorbell when masked by soft-irq-disable · 3d97a619

由 Scott Wood 提交于 6月 22, 2011

Signed-off-by: NScott Wood <scottwood@freescale.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

3d97a619

20 6月, 2011 1 次提交

powerpc: Avoid extra indirect function call in sending IPIs · 9ca980dc

由 Paul Mackerras 提交于 5月 25, 2011

On many platforms (including pSeries), smp_ops->message_pass is always
smp_muxed_ipi_message_pass.  This changes arch/powerpc/kernel/smp.c so
that if smp_ops->message_pass is NULL, it calls smp_muxed_ipi_message_pass
directly.

This means that a platform doesn't need to set both .message_pass and
.cause_ipi, only one of them.  It is a slight performance improvement
in that it gets rid of an indirect function call at the expense of a
predictable conditional branch.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

9ca980dc

26 5月, 2011 1 次提交

powerpc/cell: Use common smp ipi actions · 7ef71d75

由 Milton Miller 提交于 5月 24, 2011

The cell iic interrupt controller has enough software caused interrupts
to use a unique interrupt for each of the 4 messages powerpc uses.
This means each interrupt gets its own irq action/data combination.

Use the seperate, optimized, arch common ipi action functions
registered via the helper smp_request_message_ipi instead passing the
message as action data to a single action that then demultipexes to
the required acton via a switch statement.

smp_request_message_ipi will register the action as IRQF_PER_CPU
and IRQF_DISABLED, and WARN if the allocation fails for some reason,
so no need to print on that failure.  It will return positive if
the message will not be used by the kernel, in which case we can
free the virq.

In addition to elimiating inefficient code, this also corrects the
error that a kernel built with kexec but without a debugger would
not register the ipi for kdump to notify the other cpus of a crash.

This also restores the debugger action to be static to kernel/smp.c.
Signed-off-by: NMilton Miller <miltonm@bga.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

7ef71d75

19 5月, 2011 5 次提交

powerpc: Use bytes instead of bitops in smp ipi multiplexing · 71454272

由 Milton Miller 提交于 5月 10, 2011

Since there are only 4 messages, we can replace the atomic bit set
(which uses atomic load reserve and store conditional sequence) with
a byte stores to seperate bytes.  We still have to perform a load
reserve and store conditional sequence to avoid loosing messages on
reception but we can do that with a single call to xchg.

The do {} while and __BIG_ENDIAN specific mask testing was chosen by
looking at the generated asm code.  On gcc-4.4, the bit masking becomes
a simple bit mask and test of the register returned from xchg without
storing and loading the value to the stack like attempts with a union
of bytes and an int (or worse, loading single bit constants from the
constant pool into non-voliatle registers that had to be preseved on
the stack).  The do {} while avoids an unconditional branch to the
end of the loop to test the entry / repeat condition of a while loop
and instead optimises for the expected single iteration of the loop.

We have a full mb() at the beginning to cover ordering between send,
ipi, and receive so we can use xchg_local and forgo the further
acquire and release barriers of xchg.
Signed-off-by: NMilton Miller <miltonm@bga.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

71454272

powerpc: Add kconfig for muxed smp ipi support · 1ece355b

由 Milton Miller 提交于 5月 10, 2011

Compile the new smp ipi mux and demux code only if a platform
will make use of it.  The new config is selected as required.

The new cause_ipi smp op is only available conditionally to point out
configs where the select is required; this makes setting the op an
immediate fail instead of a deferred unresolved symbol at link.

This also creates a new config for power surge powermac upgrade support
that can be disabled in expert mode but is default on.

I also removed the depends / default y on CONFIG_XICS since it is selected
by PSERIES.
Signed-off-by: NMilton Miller <miltonm@bga.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

1ece355b

powerpc: Consolidate ipi message mux and demux · 23d72bfd

由 Milton Miller 提交于 5月 10, 2011

Consolidate the mux and demux of ipi messages into smp.c and call
a new smp_ops callback to actually trigger the ipi.

The powerpc architecture code is optimised for having 4 distinct
ipi triggers, which are mapped to 4 distinct messages (ipi many, ipi
single, scheduler ipi, and enter debugger). However, several interrupt
controllers only provide a single software triggered interrupt that
can be delivered to each cpu. To resolve this limitation, each smp_ops
implementation created a per-cpu variable that is manipulated with atomic
bitops. Since these lines will be contended they are optimialy marked as
shared_aligned and take a full cache line for each cpu. Distro kernels
may have 2 or 3 of these in their config, each taking per-cpu space
even though at most one will be in use.

This consolidation removes smp_message_recv and replaces the single call
actions cases with direct calls from the common message recognition loop.
The complicated debugger ipi case with its muxed crash handling code is
moved to debug_ipi_action which is now called from the demux code (instead
of the multi-message action calling smp_message_recv).

I put a call to reschedule_action to increase the likelyhood of correctly
merging the anticipated scheduler_ipi() hook coming from the scheduler
tree; that single required call can be inlined later.

The actual message decode is a copy of the old pseries xics code with its
memory barriers and cache line spacing, augmented with a per-cpu unsigned
long based on the book-e doorbell code. The optional data is set via a
callback from the implementation and is passed to the new cause-ipi hook
along with the logical cpu number. While currently only the doorbell
implemntation uses this data it should be almost zero cost to retrieve and
pass it -- it adds a single register load for the argument from the same
cache line to which we just completed a store and the register is dead
on return from the call. I extended the data element from unsigned int
to unsigned long in case some other code wanted to associate a pointer.

The doorbell check_self is replaced by a call to smp_muxed_ipi_resend,
conditioned on the CPU_DBELL feature. The ifdef guard could be relaxed
to CONFIG_SMP but I left it with BOOKE for now.

Also, the doorbell interrupt vector for book-e was not calling irq_enter
and irq_exit, which throws off cpu accounting and causes code to not
realize it is running in interrupt context. Add the missing calls.
Signed-off-by: NMilton Miller <miltonm@bga.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

23d72bfd

powerpc: Remove call sites of MSG_ALL_BUT_SELF · e0476371

由 Milton Miller 提交于 5月 10, 2011

The only user of MSG_ALL_BUT_SELF in the whole kernel tree is powerpc,
and it only uses it to start the debugger. Both debuggers always call
smp_send_debugger_break with MSG_ALL_BUT_SELF, and only mpic can do
anything more optimal than a loop over all online cpus, but all message
passing implementations have to code for this special delivery target.

Convert smp_send_debugger_break to take void and loop calling the smp_ops
message_pass function for each of the other cpus in the online cpumask.

Use raw_smp_processor_id() because we are either entering the debugger
or trying to start kdump and the additional warning it not useful were
it to trigger.
Signed-off-by: NMilton Miller <miltonm@bga.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

e0476371

powerpc/4xx: Fix regression in SMP on 476 · c560bbce

由 kerstin jonsson 提交于 5月 17, 2011

commit c56e5853 breaks SMP support in PPC_47x chip.
secondary_ti must be set to current thread info before callin kick_cpu or else
start_secondary_47x will jump into void when trying to return to c-code.
In the current setup secondary_ti is initialized before the CPU idle task is started
and only the boot core will start. I am not sure this is the correct solution, but it
makes SMP possible in my chip.
Note! The HOTPLUG support probably need some fixing to, There is no trampoline code
available in head_44x.S - start_secondary_resume?
Signed-off-by: NKerstin Jonsson <kerstin.jonsson@ericsson.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

c560bbce

04 5月, 2011 1 次提交

powerpc: Convert old cpumask API into new one · 104699c0

由 KOSAKI Motohiro 提交于 4月 28, 2011

Adapt new API.

Almost change is trivial. Most important change is the below line
because we plan to change task->cpus_allowed implementation.

-       ctx->cpus_allowed = current->cpus_allowed;
Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

104699c0

20 4月, 2011 1 次提交

powerpc/smp: smp_ops->kick_cpu() should be able to fail · de300974

由 Michael Ellerman 提交于 4月 11, 2011

When we start a cpu we use smp_ops->kick_cpu(), which currently
returns void, it should be able to fail. Convert it to return
int, and update all uses.

Convert all the current error cases to return -ENOENT, which is
what would eventually be returned by __cpu_up() currently when
it doesn't detect the cpu as coming up in time.
Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

de300974

14 4月, 2011 1 次提交

sched: Provide scheduler_ipi() callback in response to smp_send_reschedule() · 184748cc

由 Peter Zijlstra 提交于 4月 05, 2011

For future rework of try_to_wake_up() we'd like to push part of that
function onto the CPU the task is actually going to run on.

In order to do so we need a generic callback from the existing scheduler IPI.

This patch introduces such a generic callback: scheduler_ipi() and
implements it as a NOP.

BenH notes: PowerPC might use this IPI on offline CPUs under rare conditions!
Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk>
Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: NChris Metcalf <cmetcalf@tilera.com>
Acked-by: NJesper Nilsson <jesper.nilsson@axis.com>
Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
Reviewed-by: NFrank Rowand <frank.rowand@am.sony.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110405152728.744338123@chello.nl

184748cc

01 4月, 2011 6 次提交

B
powerpc/smp: Increase vdso_data->processorCount, not just decrease it · aeeafbfa
由 Benjamin Herrenschmidt 提交于 3月 08, 2011
```
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
```
aeeafbfa

powerpc/smp: Create idle threads on demand and properly reset them · c56e5853

由 Benjamin Herrenschmidt 提交于 3月 08, 2011

Instead of creating idle threads at boot for all possible CPUs, we
create them on demand, like x86 or ARM, and we properly call init_idle
to re-initialize an idle thread when a CPU was unplugged and is now
re-plugged.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

c56e5853

powerpc/smp: Don't expose per-cpu "cpu_state" array · 105765f4

由 Benjamin Herrenschmidt 提交于 4月 01, 2011

Instead, keep it static, expose an accessor and use that from
the PowerMac code. Avoids easy namespace collisions and will
make it easier to consolidate with other implementations.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

105765f4

powerpc/smp: Add a smp_ops->bringup_up() done callback · d7294445

由 Benjamin Herrenschmidt 提交于 3月 08, 2011

This allows us to stop abusing smp_ops->setup_cpu() for cleanup
tasks that have to take place after the initial boot time CPU
bringup.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

d7294445

B
powerpc/pmac/smp: Rename fixup_irqs() to migrate_irqs() and use it on ppc32 · 1c91cc57
由 Benjamin Herrenschmidt 提交于 2月 11, 2011
```
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
```
1c91cc57

powerpc/smp: Remove unused smp_ops->cpu_enable() · 7a53a4fe

由 Benjamin Herrenschmidt 提交于 2月 11, 2011

Remove the last remnants of cpu_enable(), everybody uses the normal
__cpu_up() path now
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

7a53a4fe