- 01 7月, 2013 2 次提交
-
-
由 Li Zhong 提交于
It seems following race is possible: cpu0 cpux smp_init->cpu_up->_cpu_up __cpu_up kick_cpu(1) ------------------------------------------------------------------------- waiting online ... ... notify CPU_STARTING set cpux active set cpux online ------------------------------------------------------------------------- finish waiting online ... sched_init_smp init_sched_domains(cpu_active_mask) build_sched_domains set cpux sibling info ------------------------------------------------------------------------- Execution of cpu0 and cpux could be concurrent between two separator lines. So if the cpux sibling information was set too late (normally impossible, but could be triggered by adding some delay in start_secondary, after setting cpu online), build_sched_domains() running on cpu0 might see cpux active, with an empty sibling mask, then cause some bad address accessing like following: [ 0.099855] Unable to handle kernel paging request for data at address 0xc00000038518078f [ 0.099868] Faulting instruction address: 0xc0000000000b7a64 [ 0.099883] Oops: Kernel access of bad area, sig: 11 [#1] [ 0.099895] PREEMPT SMP NR_CPUS=16 DEBUG_PAGEALLOC NUMA pSeries [ 0.099922] Modules linked in: [ 0.099940] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-rc1-00120-gb973425c-dirty #16 [ 0.099956] task: c0000001fed80000 ti: c0000001fed7c000 task.ti: c0000001fed7c000 [ 0.099971] NIP: c0000000000b7a64 LR: c0000000000b7a40 CTR: c0000000000b4934 [ 0.099985] REGS: c0000001fed7f760 TRAP: 0300 Not tainted (3.10.0-rc1-00120-gb973425c-dirty) [ 0.099997] MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI> CR: 24272828 XER: 20000003 [ 0.100045] SOFTE: 1 [ 0.100053] CFAR: c000000000445ee8 [ 0.100064] DAR: c00000038518078f, DSISR: 40000000 [ 0.100073] GPR00: 0000000000000080 c0000001fed7f9e0 c000000000c84d48 0000000000000010 GPR04: 0000000000000010 0000000000000000 c0000001fc55e090 0000000000000000 GPR08: ffffffffffffffff c000000000b80b30 c000000000c962d8 00000003845ffc5f GPR12: 0000000000000000 c00000000f33d000 c00000000000b9e4 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000001 0000000000000000 GPR20: c000000000ccf750 0000000000000000 c000000000c94d48 c0000001fc504000 GPR24: c0000001fc504000 c0000001fecef848 c000000000c94d48 c000000000ccf000 GPR28: c0000001fc522090 0000000000000010 c0000001fecef848 c0000001fed7fae0 [ 0.100293] NIP [c0000000000b7a64] .get_group+0x84/0xc4 [ 0.100307] LR [c0000000000b7a40] .get_group+0x60/0xc4 [ 0.100318] Call Trace: [ 0.100332] [c0000001fed7f9e0] [c0000000000dbce4] .lock_is_held+0xa8/0xd0 (unreliable) [ 0.100354] [c0000001fed7fa70] [c0000000000bf62c] .build_sched_domains+0x728/0xd14 [ 0.100375] [c0000001fed7fbe0] [c000000000af67bc] .sched_init_smp+0x4fc/0x654 [ 0.100394] [c0000001fed7fce0] [c000000000adce24] .kernel_init_freeable+0x17c/0x30c [ 0.100413] [c0000001fed7fdb0] [c00000000000ba08] .kernel_init+0x24/0x12c [ 0.100431] [c0000001fed7fe30] [c000000000009f74] .ret_from_kernel_thread+0x5c/0x68 [ 0.100445] Instruction dump: [ 0.100456] 38800010 38a00000 4838e3f5 60000000 7c6307b4 2fbf0000 419e0040 3d220001 [ 0.100496] 78601f24 39491590 e93e0008 7d6a002a <7d69582a> f97f0000 7d4a002a e93e0010 [ 0.100559] ---[ end trace 31fd0ba7d8756001 ]--- This patch tries to move the sibling maps updating before notify_cpu_starting() and cpu online, and a write barrier there to make sure sibling maps are updated before active and online mask. Signed-off-by: NLi Zhong <zhong@linux.vnet.ibm.com> Reviewed-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Paul Gortmaker 提交于
The __cpuinit type of throwaway sections might have made sense some time ago when RAM was more constrained, but now the savings do not offset the cost and complications. For example, the fix in commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time") is a good example of the nasty type of bugs that can be created with improper use of the various __init prefixes. After a discussion on LKML[1] it was decided that cpuinit should go the way of devinit and be phased out. Once all the users are gone, we can then finally remove the macros themselves from linux/init.h. This removes all the powerpc uses of the __cpuinit macros. There are no __CPUINIT users in assembly files in powerpc. [1] https://lkml.org/lkml/2013/5/20/589 Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Josh Boyer <jwboyer@gmail.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 08 4月, 2013 1 次提交
-
-
由 Thomas Gleixner 提交于
Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: NCc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Magnus Damm <magnus.damm@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Link: http://lkml.kernel.org/r/20130321215235.026838003@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 08 2月, 2013 1 次提交
-
-
由 Daniel Borkmann 提交于
It seems, we're fine with just annotating the two functions. Thus, this fixes the following build warnings on ppc64: WARNING: arch/powerpc/sysdev/xics/built-in.o(.text+0x1664): The function .ics_rtas_init() references the function __init .xics_register_ics(). This is often because .ics_rtas_init lacks a __init annotation or the annotation of .xics_register_ics is wrong. WARNING: arch/powerpc/sysdev/built-in.o(.text+0x6044): The function .ics_rtas_init() references the function __init .xics_register_ics(). This is often because .ics_rtas_init lacks a __init annotation or the annotation of .xics_register_ics is wrong. WARNING: arch/powerpc/kernel/built-in.o(.text+0x2db30): The function .start_secondary() references the function __cpuinit .vdso_getcpu_init(). This is often because .start_secondary lacks a __cpuinit annotation or the annotation of .vdso_getcpu_init is wrong. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 04 1月, 2013 1 次提交
-
-
由 Greg Kroah-Hartman 提交于
CONFIG_HOTPLUG is going away as an option. As a result, the __dev* markings need to be removed. This change removes the use of __devinit, __devexit_p, __devinitdata, __devinitconst, and __devexit from these drivers. Based on patches originally written by Bill Pemberton, but redone by me in order to handle some of the coding style issues better, by hand. Cc: Bill Pemberton <wfp5p@virginia.edu> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 30 10月, 2012 1 次提交
-
-
由 Paul Mackerras 提交于
When a Book3S HV KVM guest is running, we need the host to be in single-thread mode, that is, all of the cores (or at least all of the cores where the KVM guest could run) to be running only one active hardware thread. This is because of the hardware restriction in POWER processors that all of the hardware threads in the core must be in the same logical partition. Complying with this restriction is much easier if, from the host kernel's point of view, only one hardware thread is active. This adds two hooks in the SMP hotplug code to allow the KVM code to make sure that secondary threads (i.e. hardware threads other than thread 0) cannot come online while any KVM guest exists. The KVM code still has to check that any core where it runs a guest has the secondary threads offline, but having done that check it can now be sure that they will not come online while the guest is running. Signed-off-by: NPaul Mackerras <paulus@samba.org> Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
- 19 9月, 2012 1 次提交
-
-
由 Zhao Chenhui 提交于
During suspend, all interrupts including IPI will be disabled. In this case, the suspend process will hang in SMP. To prevent this, pass the flag IRQF_NO_SUSPEND when requesting IPI irq. Signed-off-by: NZhao Chenhui <chenhui.zhao@freescale.com> Signed-off-by: NLi Yang <leoli@freescale.com> Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
-
- 13 9月, 2012 1 次提交
-
-
由 Zhao Chenhui 提交于
In the case of cpu hotplug, the cpu_state should be set to CPU_UP_PREPARE when kicking cpu. Otherwise, the cpu_state is always CPU_DEAD after calling generic_set_cpu_dead(), which makes the delay in generic_cpu_die() not happen. Signed-off-by: NZhao Chenhui <chenhui.zhao@freescale.com> Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
-
- 05 9月, 2012 1 次提交
-
-
由 Paul Mackerras 提交于
We have been observing hangs, both of KVM guest vcpu tasks and more generally, where a process that is woken doesn't properly wake up and continue to run, but instead sticks in TASK_WAKING state. This happens because the update of rq->wake_list in ttwu_queue_remote() is not ordered with the update of ipi_message in smp_muxed_ipi_message_pass(), and the reading of rq->wake_list in scheduler_ipi() is not ordered with the reading of ipi_message in smp_ipi_demux(). Thus it is possible for the IPI receiver not to see the updated rq->wake_list and therefore conclude that there is nothing for it to do. In order to make sure that anything done before smp_send_reschedule() is ordered before anything done in the resulting call to scheduler_ipi(), this adds barriers in smp_muxed_message_pass() and smp_ipi_demux(). The barrier in smp_muxed_message_pass() is a full barrier to ensure that there is a full ordering between the smp_send_reschedule() caller and scheduler_ipi(). In smp_ipi_demux(), we use xchg() rather than xchg_local() because xchg() includes release and acquire barriers. Using xchg() rather than xchg_local() makes sense given that ipi_message is not just accessed locally. This moves the barrier between setting the message and calling the cause_ipi() function into the individual cause_ipi implementations. Most of them -- those that used outb, out_8 or similar -- already had a full barrier because out_8 etc. include a sync before the MMIO store. This adds an explicit barrier in the two remaining cases. These changes made no measurable difference to the speed of IPIs as measured using a simple ping-pong latency test across two CPUs on different cores of a POWER7 machine. The analysis of the reason why processes were not waking up properly is due to Milton Miller. Cc: stable@vger.kernel.org # v3.0+ Reported-by: NMilton Miller <miltonm@bga.com> Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 11 7月, 2012 1 次提交
-
-
由 Anton Blanchard 提交于
We have a request for a fast method of getting CPU and NUMA node IDs from userspace. This patch implements a getcpu VDSO function, similar to x86. Ben suggested we use SPRG3 which is userspace readable. SPRG3 can be modified by a KVM guest, so we save the SPRG3 value in the paca and restore it when transitioning from the guest to the host. I have a glibc patch that implements sched_getcpu on top of this. Testing on a POWER7: baseline: 538 cycles vdso: 30 cycles Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 03 7月, 2012 1 次提交
-
-
由 Yong Zhang 提交于
1) call_function.lock used in smp_call_function_many() is just to protect call_function.queue and &data->refs, cpu_online_mask is outside of the lock. And it's not necessary to protect cpu_online_mask, because data->cpumask is pre-calculate and even if a cpu is brougt up when calling arch_send_call_function_ipi_mask(), it's harmless because validation test in generic_smp_call_function_interrupt() will take care of it. 2) For cpu down issue, stop_machine() will guarantee that no concurrent smp_call_fuction() is processing. Signed-off-by: NYong Zhang <yong.zhang0@gmail.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 05 6月, 2012 1 次提交
-
-
由 Yong Zhang 提交于
ipi_call_lock/unlock() lock resp. unlock call_function.lock. This lock protects only the call_function data structure itself, but it's completely unrelated to cpu_online_mask. The mask to which the IPIs are sent is calculated before call_function.lock is taken in smp_call_function_many(), so the locking around set_cpu_online() is pointless and can be removed. [ tglx: Massaged changelog ] Signed-off-by: NYong Zhang <yong.zhang0@gmail.com> Cc: ralf@linux-mips.org Cc: sshtylyov@mvista.com Cc: david.daney@cavium.com Cc: nikunj@linux.vnet.ibm.com Cc: paulmck@linux.vnet.ibm.com Cc: axboe@kernel.dk Cc: peterz@infradead.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1338275765-3217-10-git-send-email-yong.zhang0@gmail.comAcked-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 26 4月, 2012 2 次提交
-
-
由 Thomas Gleixner 提交于
Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Link: http://lkml.kernel.org/r/20120420124557.311212868@linutronix.de
-
由 Thomas Gleixner 提交于
Preparatory patch to make the idle thread allocation for secondary cpus generic. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Howells <dhowells@redhat.com> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: David S. Miller <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Richard Weinberger <richard@nod.at> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/20120420124556.964170564@linutronix.de
-
- 29 3月, 2012 1 次提交
-
-
由 David Howells 提交于
Disintegrate asm/system.h for PowerPC. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> cc: linuxppc-dev@lists.ozlabs.org
-
- 22 12月, 2011 1 次提交
-
-
由 Kay Sievers 提交于
This moves the 'cpu sysdev_class' over to a regular 'cpu' subsystem and converts the devices to regular devices. The sysdev drivers are implemented as subsystem interfaces now. After all sysdev classes are ported to regular driver core entities, the sysdev implementation will be entirely removed from the kernel. Userspace relies on events and generic sysfs subsystem infrastructure from sysdev devices, which are made available with this conversion. Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@amd64.org> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Cc: Len Brown <lenb@kernel.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: NKay Sievers <kay.sievers@vrfy.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
- 25 11月, 2011 1 次提交
-
-
由 Thomas Gleixner 提交于
IPI handlers cannot be threaded. Remove the obsolete IRQF_DISABLED flag (see commit e58aa3d2) while at it. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 08 11月, 2011 1 次提交
-
-
由 Yong Zhang 提交于
Since commit [e58aa3d2: genirq: Run irq handlers with interrupts disabled], We run all interrupt handlers with interrupts disabled and we even check and yell when an interrupt handler returns with interrupts enabled (see commit [b738a50a: genirq: Warn when handler enables interrupts]). So now this flag is a NOOP and can be removed. Signed-off-by: NYong Zhang <yong.zhang0@gmail.com> Acked-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NGeoff Levand <geoff@infradead.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 01 11月, 2011 1 次提交
-
-
由 Paul Gortmaker 提交于
All these files were including module.h just for the basic EXPORT_SYMBOL infrastructure. We can shift them off to the export.h header which is a way smaller footprint and thus realize some compile time gains. Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
- 20 9月, 2011 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
This adds more generic support for doing CPU hotplug with a simple idle loop and no actual reset of the processors. The generic smp_generic_kick_cpu() does the hotplug bringup trick if the PACA shows that the CPU has already been started at boot and we provide an accessor for the CPU state. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 27 7月, 2011 1 次提交
-
-
由 Arun Sharma 提交于
This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h> Signed-off-by: NArun Sharma <asharma@fb.com> Reviewed-by: NEric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: NMike Frysinger <vapier@gentoo.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 12 7月, 2011 1 次提交
-
-
由 Paul Mackerras 提交于
This adds support for KVM running on 64-bit Book 3S processors, specifically POWER7, in hypervisor mode. Using hypervisor mode means that the guest can use the processor's supervisor mode. That means that the guest can execute privileged instructions and access privileged registers itself without trapping to the host. This gives excellent performance, but does mean that KVM cannot emulate a processor architecture other than the one that the hardware implements. This code assumes that the guest is running paravirtualized using the PAPR (Power Architecture Platform Requirements) interface, which is the interface that IBM's PowerVM hypervisor uses. That means that existing Linux distributions that run on IBM pSeries machines will also run under KVM without modification. In order to communicate the PAPR hypercalls to qemu, this adds a new KVM_EXIT_PAPR_HCALL exit code to include/linux/kvm.h. Currently the choice between book3s_hv support and book3s_pr support (i.e. the existing code, which runs the guest in user mode) has to be made at kernel configuration time, so a given kernel binary can only do one or the other. This new book3s_hv code doesn't support MMIO emulation at present. Since we are running paravirtualized guests, this isn't a serious restriction. With the guest running in supervisor mode, most exceptions go straight to the guest. We will never get data or instruction storage or segment interrupts, alignment interrupts, decrementer interrupts, program interrupts, single-step interrupts, etc., coming to the hypervisor from the guest. Therefore this introduces a new KVMTEST_NONHV macro for the exception entry path so that we don't have to do the KVM test on entry to those exception handlers. We do however get hypervisor decrementer, hypervisor data storage, hypervisor instruction storage, and hypervisor emulation assist interrupts, so we have to handle those. In hypervisor mode, real-mode accesses can access all of RAM, not just a limited amount. Therefore we put all the guest state in the vcpu.arch and use the shadow_vcpu in the PACA only for temporary scratch space. We allocate the vcpu with kzalloc rather than vzalloc, and we don't use anything in the kvmppc_vcpu_book3s struct, so we don't allocate it. We don't have a shared page with the guest, but we still need a kvm_vcpu_arch_shared struct to store the values of various registers, so we include one in the vcpu_arch struct. The POWER7 processor has a restriction that all threads in a core have to be in the same partition. MMU-on kernel code counts as a partition (partition 0), so we have to do a partition switch on every entry to and exit from the guest. At present we require the host and guest to run in single-thread mode because of this hardware restriction. This code allocates a hashed page table for the guest and initializes it with HPTEs for the guest's Virtual Real Memory Area (VRMA). We require that the guest memory is allocated using 16MB huge pages, in order to simplify the low-level memory management. This also means that we can get away without tracking paging activity in the host for now, since huge pages can't be paged or swapped. This also adds a few new exports needed by the book3s_hv code. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
- 08 7月, 2011 1 次提交
-
-
由 Becky Bruce 提交于
This is used to round-robin TLBCAM entries. Signed-off-by: NBecky Bruce <beckyb@kernel.crashing.org> Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
-
- 29 6月, 2011 1 次提交
-
-
由 Scott Wood 提交于
Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 20 6月, 2011 1 次提交
-
-
由 Paul Mackerras 提交于
On many platforms (including pSeries), smp_ops->message_pass is always smp_muxed_ipi_message_pass. This changes arch/powerpc/kernel/smp.c so that if smp_ops->message_pass is NULL, it calls smp_muxed_ipi_message_pass directly. This means that a platform doesn't need to set both .message_pass and .cause_ipi, only one of them. It is a slight performance improvement in that it gets rid of an indirect function call at the expense of a predictable conditional branch. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 26 5月, 2011 1 次提交
-
-
由 Milton Miller 提交于
The cell iic interrupt controller has enough software caused interrupts to use a unique interrupt for each of the 4 messages powerpc uses. This means each interrupt gets its own irq action/data combination. Use the seperate, optimized, arch common ipi action functions registered via the helper smp_request_message_ipi instead passing the message as action data to a single action that then demultipexes to the required acton via a switch statement. smp_request_message_ipi will register the action as IRQF_PER_CPU and IRQF_DISABLED, and WARN if the allocation fails for some reason, so no need to print on that failure. It will return positive if the message will not be used by the kernel, in which case we can free the virq. In addition to elimiating inefficient code, this also corrects the error that a kernel built with kexec but without a debugger would not register the ipi for kdump to notify the other cpus of a crash. This also restores the debugger action to be static to kernel/smp.c. Signed-off-by: NMilton Miller <miltonm@bga.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 19 5月, 2011 5 次提交
-
-
由 Milton Miller 提交于
Since there are only 4 messages, we can replace the atomic bit set (which uses atomic load reserve and store conditional sequence) with a byte stores to seperate bytes. We still have to perform a load reserve and store conditional sequence to avoid loosing messages on reception but we can do that with a single call to xchg. The do {} while and __BIG_ENDIAN specific mask testing was chosen by looking at the generated asm code. On gcc-4.4, the bit masking becomes a simple bit mask and test of the register returned from xchg without storing and loading the value to the stack like attempts with a union of bytes and an int (or worse, loading single bit constants from the constant pool into non-voliatle registers that had to be preseved on the stack). The do {} while avoids an unconditional branch to the end of the loop to test the entry / repeat condition of a while loop and instead optimises for the expected single iteration of the loop. We have a full mb() at the beginning to cover ordering between send, ipi, and receive so we can use xchg_local and forgo the further acquire and release barriers of xchg. Signed-off-by: NMilton Miller <miltonm@bga.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Milton Miller 提交于
Compile the new smp ipi mux and demux code only if a platform will make use of it. The new config is selected as required. The new cause_ipi smp op is only available conditionally to point out configs where the select is required; this makes setting the op an immediate fail instead of a deferred unresolved symbol at link. This also creates a new config for power surge powermac upgrade support that can be disabled in expert mode but is default on. I also removed the depends / default y on CONFIG_XICS since it is selected by PSERIES. Signed-off-by: NMilton Miller <miltonm@bga.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Milton Miller 提交于
Consolidate the mux and demux of ipi messages into smp.c and call a new smp_ops callback to actually trigger the ipi. The powerpc architecture code is optimised for having 4 distinct ipi triggers, which are mapped to 4 distinct messages (ipi many, ipi single, scheduler ipi, and enter debugger). However, several interrupt controllers only provide a single software triggered interrupt that can be delivered to each cpu. To resolve this limitation, each smp_ops implementation created a per-cpu variable that is manipulated with atomic bitops. Since these lines will be contended they are optimialy marked as shared_aligned and take a full cache line for each cpu. Distro kernels may have 2 or 3 of these in their config, each taking per-cpu space even though at most one will be in use. This consolidation removes smp_message_recv and replaces the single call actions cases with direct calls from the common message recognition loop. The complicated debugger ipi case with its muxed crash handling code is moved to debug_ipi_action which is now called from the demux code (instead of the multi-message action calling smp_message_recv). I put a call to reschedule_action to increase the likelyhood of correctly merging the anticipated scheduler_ipi() hook coming from the scheduler tree; that single required call can be inlined later. The actual message decode is a copy of the old pseries xics code with its memory barriers and cache line spacing, augmented with a per-cpu unsigned long based on the book-e doorbell code. The optional data is set via a callback from the implementation and is passed to the new cause-ipi hook along with the logical cpu number. While currently only the doorbell implemntation uses this data it should be almost zero cost to retrieve and pass it -- it adds a single register load for the argument from the same cache line to which we just completed a store and the register is dead on return from the call. I extended the data element from unsigned int to unsigned long in case some other code wanted to associate a pointer. The doorbell check_self is replaced by a call to smp_muxed_ipi_resend, conditioned on the CPU_DBELL feature. The ifdef guard could be relaxed to CONFIG_SMP but I left it with BOOKE for now. Also, the doorbell interrupt vector for book-e was not calling irq_enter and irq_exit, which throws off cpu accounting and causes code to not realize it is running in interrupt context. Add the missing calls. Signed-off-by: NMilton Miller <miltonm@bga.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Milton Miller 提交于
The only user of MSG_ALL_BUT_SELF in the whole kernel tree is powerpc, and it only uses it to start the debugger. Both debuggers always call smp_send_debugger_break with MSG_ALL_BUT_SELF, and only mpic can do anything more optimal than a loop over all online cpus, but all message passing implementations have to code for this special delivery target. Convert smp_send_debugger_break to take void and loop calling the smp_ops message_pass function for each of the other cpus in the online cpumask. Use raw_smp_processor_id() because we are either entering the debugger or trying to start kdump and the additional warning it not useful were it to trigger. Signed-off-by: NMilton Miller <miltonm@bga.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 kerstin jonsson 提交于
commit c56e5853 breaks SMP support in PPC_47x chip. secondary_ti must be set to current thread info before callin kick_cpu or else start_secondary_47x will jump into void when trying to return to c-code. In the current setup secondary_ti is initialized before the CPU idle task is started and only the boot core will start. I am not sure this is the correct solution, but it makes SMP possible in my chip. Note! The HOTPLUG support probably need some fixing to, There is no trampoline code available in head_44x.S - start_secondary_resume? Signed-off-by: NKerstin Jonsson <kerstin.jonsson@ericsson.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 04 5月, 2011 1 次提交
-
-
由 KOSAKI Motohiro 提交于
Adapt new API. Almost change is trivial. Most important change is the below line because we plan to change task->cpus_allowed implementation. - ctx->cpus_allowed = current->cpus_allowed; Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 20 4月, 2011 1 次提交
-
-
由 Michael Ellerman 提交于
When we start a cpu we use smp_ops->kick_cpu(), which currently returns void, it should be able to fail. Convert it to return int, and update all uses. Convert all the current error cases to return -ENOENT, which is what would eventually be returned by __cpu_up() currently when it doesn't detect the cpu as coming up in time. Signed-off-by: NMichael Ellerman <michael@ellerman.id.au> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 14 4月, 2011 1 次提交
-
-
由 Peter Zijlstra 提交于
For future rework of try_to_wake_up() we'd like to push part of that function onto the CPU the task is actually going to run on. In order to do so we need a generic callback from the existing scheduler IPI. This patch introduces such a generic callback: scheduler_ipi() and implements it as a NOP. BenH notes: PowerPC might use this IPI on offline CPUs under rare conditions! Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk> Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: NChris Metcalf <cmetcalf@tilera.com> Acked-by: NJesper Nilsson <jesper.nilsson@axis.com> Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NRalf Baechle <ralf@linux-mips.org> Reviewed-by: NFrank Rowand <frank.rowand@am.sony.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110405152728.744338123@chello.nl
-
- 01 4月, 2011 6 次提交
-
-
由 Benjamin Herrenschmidt 提交于
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
Instead of creating idle threads at boot for all possible CPUs, we create them on demand, like x86 or ARM, and we properly call init_idle to re-initialize an idle thread when a CPU was unplugged and is now re-plugged. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
Instead, keep it static, expose an accessor and use that from the PowerMac code. Avoids easy namespace collisions and will make it easier to consolidate with other implementations. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
This allows us to stop abusing smp_ops->setup_cpu() for cleanup tasks that have to take place after the initial boot time CPU bringup. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
Remove the last remnants of cpu_enable(), everybody uses the normal __cpu_up() path now Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-