- 23 1月, 2007 1 次提交
-
-
由 James Bottomley 提交于
The current PDA code, which went in in post 2.6.19 has a flaw in that it doesn't correctly cycle the GDT and %GS segment through the boot PDA, the CPU PDA and finally the per-cpu PDA. The bug generally doesn't show up if the boot CPU id is zero, but everything falls apart for a non zero boot CPU id. The basically kills voyager which is perfectly capable of doing non zero CPU id boots, so voyager currently won't boot without this. The fix is to be careful and actually do the GDT setups correctly. Signed-off-by: NJames Bottomley <James.Bottomley@SteelEye.com> Cc: Andi Kleen <ak@suse.de> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Andrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 11 1月, 2007 1 次提交
-
-
由 Vivek Goyal 提交于
o Misc smpboot/cpu hotplug path cleanups. I did those to supress the warnings generated by MODPOST. These warnings are visible only if CONFIG_RELOCATABLE=y. o CONFIG_RELOCATABLE compiles the kernel with --emit-relocs option. This option retains relocation information in vmlinux file and MODPOST is quick to spit out "Section mismatch" warnings. o This patch fixes some of those warnings. Many of the functions in smpboot case are __devinit type and they in turn accesses text/data which if of type __cpuinit. Now if CONFIG_HOTPLUG=y and CONFIG_HOTPLUG_CPU=n then we end up in cases where a function in .text segment is calling another function in .init.text segment and MODPOST emits warning. WARNING: vmlinux - Section mismatch: reference to .init.text:identify_cpu from .text between 'smp_store_cpu_info' (at offset 0xc011020d) and 'do_boot_cpu' WARNING: vmlinux - Section mismatch: reference to .init.text:init_gdt from .text between 'do_boot_cpu' (at offset 0xc01102ca) and '__cpu_up' WARNING: vmlinux - Section mismatch: reference to .init.text:print_cpu_info from .text between 'do_boot_cpu' (at offset 0xc01105d0) and '__cpu_up' o It also fixes the issues where CONFIG_HOTPLUG_CPU=y and start_secondary() is calling smp_callin() which in-turn calls synchronize_tsc_ap() which is of type __init. This should have meant broken CPU hotplug. WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'start_secondary' (at offset 0xc011603f) and 'initialize_secondary' WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'MP_processor_info' (at offset 0xc0116a4f) and 'mp_register_lapic' WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'MP_processor_info' (at offset 0xc0116a4f) and 'mp_register_lapic' Signed-off-by: NVivek Goyal <vgoyal@in.ibm.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NAndi Kleen <ak@suse.de>
-
- 06 1月, 2007 1 次提交
-
-
由 Vivek Goyal 提交于
o Currently synchronize_tsc_ap() is of type __init. It is called by smp_callin() which is of type __cpuinit. So synchronize_tsc_ap() should be of type __cpuinit. o Modpost generates warnings for i386 if CONFIG_RELOCATABLE=y and CONFIG_HOTPLUG_CPU=y WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'start_secondary' (at offset 0xc01164dc) and 'initialize_secondary' WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'start_secondary' (at offset 0xc01164e8) and 'initialize_secondary' o tsc is of type __initdata. It should be of type __cpuinitdata. Signed-off-by: NVivek Goyal <vgoyal@in.ibm.com> Cc: Andi Kleen <ak@suse.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 14 12月, 2006 1 次提交
-
-
由 Andrew Morton 提交于
#ifdef CONFIG_SMP in a file which isn't compiled in non-SMP kernels. Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 10 12月, 2006 1 次提交
-
-
由 Randy Dunlap 提交于
oprofile uses smp_num_siblings without testing for CONFIG_X86_HT. I looked at modifying oprofile, but this way is cleaner & simpler and I didn't see a good reason not to just export it when CONFIG_SMP. WARNING: "smp_num_siblings" [arch/i386/oprofile/oprofile.ko] undefined! Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NAndi Kleen <ak@suse.de>
-
- 09 12月, 2006 1 次提交
-
-
由 Shaohua Li 提交于
In VMSPLIT mode, kernel PGD might have more entries than user space. Acked-by: NJens Axboe <jens.axboe@oracle.com> Signed-off-by: NShaohua Li <shaohua.li@intel.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 07 12月, 2006 4 次提交
-
-
由 Adrian Bunk 提交于
- remove the write-only local variable "bandwidth" - don't set "max_cache_size" in the (cachesize < 0) case: that's already handled in kernel/sched.c:measure_migration_cost() Signed-off-by: NAdrian Bunk <bunk@stusta.de> Signed-off-by: NAndi Kleen <ak@suse.de> Acked-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@osdl.org>
-
由 Siddha, Suresh B 提交于
Move the irqbalance quirks for E7320/E7520/E7525(Errata 23 in http://download.intel.com/design/chipsets/specupdt/30304203.pdf) to early quirks. And add a PCI quirk for these platforms to check(which happens very late during the boot) if the APIC routing is indeed set to default flat mode. This fixes the breakage(in x86_64) of this quirk due to cpu hotplug which selects physical mode instead of the logical flat(as needed for this errata workaround). Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: NAndi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Cc: "Li, Shaohua" <shaohua.li@intel.com> Signed-off-by: NAndrew Morton <akpm@osdl.org>
-
由 Rusty Russell 提交于
Create a paravirt.h header for all the critical operations which need to be replaced with hypervisor calls, and include that instead of defining native operations, when CONFIG_PARAVIRT. This patch does the dumbest possible replacement of paravirtualized instructions: calls through a "paravirt_ops" structure. Currently these are function implementations of native hardware: hypervisors will override the ops structure with their own variants. All the pv-ops functions are declared "fastcall" so that a specific register-based ABI is used, to make inlining assember easier. And: +From: Andy Whitcroft <apw@shadowen.org> The paravirt ops introduce a 'weak' attribute onto memory_setup(). Code ordering leads to the following warnings on x86: arch/i386/kernel/setup.c:651: warning: weak declaration of `memory_setup' after first use results in unspecified behavior Move memory_setup() to avoid this. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NChris Wright <chrisw@sous-sol.org> Signed-off-by: NAndi Kleen <ak@suse.de> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Zachary Amsden <zach@vmware.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NAndy Whitcroft <apw@shadowen.org>
-
由 Jeremy Fitzhardinge 提交于
When a CPU is brought up, a PDA and GDT are allocated for it. The GDT's __KERNEL_PDA entry is pointed to the allocated PDA memory, so that all references using this segment descriptor will refer to the PDA. This patch rearranges CPU initialization a bit, so that the GDT/PDA are set up as early as possible in cpu_init(). Also for secondary CPUs, GDT+PDA are preallocated and initialized so all the secondary CPU needs to do is set up the ldt and load %gs. This will be important once smp_processor_id() and current use the PDA. In all cases, the PDA is set up in head.S, before a CPU starts running C code, so the PDA is always available. Signed-off-by: NJeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: NAndi Kleen <ak@suse.de> Cc: Chuck Ebbert <76306.1226@compuserve.com> Cc: Zachary Amsden <zach@vmware.com> Cc: Jan Beulich <jbeulich@novell.com> Cc: Andi Kleen <ak@suse.de> Cc: James Bottomley <James.Bottomley@SteelEye.com> Cc: Matt Tolentino <matthew.e.tolentino@intel.com> Signed-off-by: NAndrew Morton <akpm@osdl.org>
-
- 22 11月, 2006 1 次提交
-
-
由 David Howells 提交于
Fix up for make allyesconfig. Signed-Off-By: NDavid Howells <dhowells@redhat.com>
-
- 04 10月, 2006 1 次提交
-
-
由 Keith Mannthey 提交于
This allows numaq to properly align cpus to their given node during boot. Pass logical apicid to apicid_to_node and allow the summit sub-arch to use physical apicid (hard_smp_processor_id()). Tested against numaq and summit based systems with no issues. Signed-off-by: NKeith Mannthey <kmannth@us.ibm.com> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 02 10月, 2006 1 次提交
-
-
由 Greg Banks 提交于
cpumask: ensure that node_to_cpumask() is available to modules for all supported combinations of architecture and CONFIG_NUMA. Signed-off-by: NGreg Banks <gnb@melbourne.sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 01 10月, 2006 1 次提交
-
-
由 Peter Zijlstra 提交于
All on stack DECLARE_COMPLETIONs should be replaced by: DECLARE_COMPLETION_ONSTACK Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NIngo Molnar <mingo@elte.hu> Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 30 9月, 2006 1 次提交
-
-
由 keith mannthey 提交于
Convert the i386 summit subarch apicid_to_node to use node information provided by the SRAT. It was discussed a little on LKML a few weeks ago and was seen as an acceptable fix. The current way of obtaining the nodeid static inline int apicid_to_node(int logical_apicid) { return logical_apicid >> 5; } is just not correct for all summit systems/bios. Assuming the apicid matches the Linux node number require a leap of faith that the bios mapped out the apicids a set way. Modern summit HW (IBM x460) does not layout its bios in the manner for various reasons and is unable to boot i386 numa. The best way to get the correct apicid to node information is from the SRAT table during boot. It lays out what apicid belongs to what node. I use this information to create a table for use at run time. Signed-off-by: NKeith Mannthey <kmannth@us.ibm.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 26 9月, 2006 4 次提交
-
-
由 Dave Jones 提交于
We have a test that looks for invalid pairings of certain athlon/durons that weren't designed for SMP, and taint accordingly (with 'S') if we find such a configuration. However, this test shouldn't fire if there's only a single CPU present. It's perfectly valid for an SMP kernel to boot on UP hardware for example. AK: changed to num_possible_cpus() Signed-off-by: NDave Jones <davej@redhat.com> Signed-off-by: NAndi Kleen <ak@suse.de>
-
由 Rusty Russell 提交于
This patch replaces the open-coded early commandline parsing throughout the i386 boot code with the generic mechanism (already used by ppc, powerpc, ia64 and s390). The code was inconsistent with whether it deletes the option from the cmdline or not, meaning some of these will get passed through the environment into init. This transformation is mainly mechanical, but there are some notable parts: 1) Grammar: s/linux never set's it up/linux never sets it up/ 2) Remove hacked-in earlyprintk= option scanning. When someone actually implements CONFIG_EARLY_PRINTK, then they can use early_param(). [AK: actually it is implemented, but I'm adding the early_param it in the next x86-64 patch] 3) Move declaration of generic_apic_probe() from setup.c into asm/apic.h 4) Various parameters now moved into their appropriate files (thanks Andi). 5) All parse functions which examine arg need to check for NULL, except one where it has subtle humor value. AK: readded acpi_sci handling which was completely dropped AK: moved some more variables into acpi/boot.c Cc: len.brown@intel.com Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NAndi Kleen <ak@suse.de>
-
由 Shaohua Li 提交于
Making NMI suspend/resume work with SMP. We use CPU hotplug to offline APs in SMP suspend/resume. Only BSP executes sysdev's .suspend/.resume method. APs should follow CPU hotplug code path. And: +From: Don Zickus <dzickus@redhat.com> Makes the start/stop paths of nmi watchdog more robust to handle the suspend/resume cases more gracefully. AK: I merged the two patches together Signed-off-by: NShaohua Li <shaohua.li@intel.com> Signed-off-by: NAndi Kleen <ak@suse.de> Cc: Don Zickus <dzickus@redhat.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: NAndrew Morton <akpm@osdl.org>
-
由 keith mannthey 提交于
If there is only 1 node in the system cpus should think they are apart of some other node. If cases where a real numa system boots the Flat numa option make sure the cpus don't claim to be apart on a non-existent node. Signed-off-by: NKeith Mannthey <kmannth@us.ibm.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Andi Kleen <ak@suse.de> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 01 8月, 2006 1 次提交
-
-
由 Andrew Morton 提交于
- Move the tsc synchronisation variables into a struct, mark it __initdata - local `realdelta' wants to be 64-bit - Print the skew for negative skews, as well as for positive ones - remove dead code Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 01 7月, 2006 1 次提交
-
-
由 Jörn Engel 提交于
Signed-off-by: NJörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: NAdrian Bunk <bunk@stusta.de>
-
- 28 6月, 2006 3 次提交
-
-
由 Siddha, Suresh B 提交于
sysfs entries 'sched_mc_power_savings' and 'sched_smt_power_savings' in /sys/devices/system/cpu/ control the MC/SMT power savings policy for the scheduler. Based on the values (1-enable, 0-disable) for these controls, sched groups cpu power will be determined for different domains. When power savings policy is enabled and under light load conditions, scheduler will minimize the physical packages/cpu cores carrying the load and thus conserving power(with a perf impact based on the workload characteristics... see OLS 2005 CMP kernel scheduler paper for more details..) Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Con Kolivas <kernel@kolivas.org> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Rohit Seth 提交于
Move the phys_core_id and cpu_core_id to cpuinfo_x86 structure. Similar patch for x86_64 is already accepted by Andi earlier this week. [akpm@osdl.org: fix warning] Signed-off-by: NRohit Seth <rohitseth@google.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Shaohua Li 提交于
The patch fixes two issues: 1. cpu_init is called with interrupt disabled. Allocating gdt table there isn't good at runtime. 2. gdt table page cause memory leak in CPU hotplug case. Signed-off-by: NShaohua Li <shaohua.li@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Zachary Amsden <zach@vmware.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 27 6月, 2006 1 次提交
-
-
由 Don Zickus 提交于
Misc header cleanup for nmi watchdog. Signed-off-by: NDon Zickus <dzickus@redhat.com> Signed-off-by: NAndi Kleen <ak@suse.de> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 26 6月, 2006 1 次提交
-
-
由 Andreas Mohr 提交于
Add cpu_relax() to various smpboot.c init loops. cpu_relax() always implies a barrier (according to Arjan), so remove those as well. Signed-off-by: NAndreas Mohr <andi@lisas.de> Cc: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 28 4月, 2006 1 次提交
-
-
由 Dave Jones 提交于
These messages are kinda silly.. CPU#0 had 0 usecs TSC skew, fixed it up. CPU#1 had 0 usecs TSC skew, fixed it up. inspired from: http://bugzilla.kernel.org/attachment.cgi?id=7713&action=viewSigned-off-by: NDave Jones <davej@redhat.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 28 3月, 2006 1 次提交
-
-
由 Siddha, Suresh B 提交于
Add a new sched domain for representing multi-core with shared caches between cores. Consider a dual package system, each package containing two cores and with last level cache shared between cores with in a package. If there are two runnable processes, with this appended patch those two processes will be scheduled on different packages. On such systems, with this patch we have observed 8% perf improvement with specJBB(2 warehouse) benchmark and 35% improvement with CFP2000 rate(with 2 users). This new domain will come into play only on multi-core systems with shared caches. On other systems, this sched domain will be removed by domain degeneration code. This new domain can be also used for implementing power savings policy (see OLS 2005 CMP kernel scheduler paper for more details.. I will post another patch for power savings policy soon) Most of the arch/* file changes are for cpu_coregroup_map() implementation. Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 26 3月, 2006 1 次提交
-
-
由 Ashok Raj 提交于
- Moved check for online cpu out of smp_prepare_cpu() - Moved default declaration of smp_prepare_cpu() to kernel/cpu.c - Removed lock_cpu_hotplug() from smp_prepare_cpu() to around it, since its called from cpu_up() as well now. - Removed clearing from cpu_present_map during cpu_offline as it breaks using cpu_up() directly during a subsequent online operation. Signed-off-by: NAshok Raj <ashok.raj@intel.com> Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: "Li, Shaohua" <shaohua.li@intel.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 23 3月, 2006 1 次提交
-
-
由 Gerd Hoffmann 提交于
Implement SMP alternatives, i.e. switching at runtime between different code versions for UP and SMP. The code can patch both SMP->UP and UP->SMP. The UP->SMP case is useful for CPU hotplug. With CONFIG_CPU_HOTPLUG enabled the code switches to UP at boot time and when the number of CPUs goes down to 1, and switches to SMP when the number of CPUs goes up to 2. Without CONFIG_CPU_HOTPLUG or on non-SMP-capable systems the code is patched once at boot time (if needed) and the tables are released afterwards. The changes in detail: * The current alternatives bits are moved to a separate file, the SMP alternatives code is added there. * The patch adds some new elf sections to the kernel: .smp_altinstructions like .altinstructions, also contains a list of alt_instr structs. .smp_altinstr_replacement like .altinstr_replacement, but also has some space to save original instruction before replaving it. .smp_locks list of pointers to lock prefixes which can be nop'ed out on UP. The first two are used to replace more complex instruction sequences such as spinlocks and semaphores. It would be possible to deal with the lock prefixes with that as well, but by handling them as special case the table sizes become much smaller. * The sections are page-aligned and padded up to page size, so they can be free if they are not needed. * Splitted the code to release init pages to a separate function and use it to release the elf sections if they are unused. Signed-off-by: NGerd Hoffmann <kraxel@suse.de> Signed-off-by: NChuck Ebbert <76306.1226@compuserve.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 17 3月, 2006 1 次提交
-
-
由 Srivatsa Vaddagiri 提交于
Bryce reported a bug wherein offlining CPU0 (on x86 box) and then subsequently onlining it resulted in a lockup. On x86, CPU0 is never offlined. The subsequent attempt to online CPU0 doesn't take that into account. It actually tries to bootup the already booted CPU. Following patch fixes the problem (as acknowledged by Bryce). Please consider for inclusion in 2.6.16. Check if cpu is already online. Signed-off-by: NSrivatsa Vaddagiri <vatsa@in.ibm.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 25 2月, 2006 1 次提交
-
-
由 James Bottomley 提交于
Recent GDT changes broke the SMP boot sequence if the booting CPU is numbered anything other than zero. There's also a subtle source of error in that the boot time CPU now uses cpu_gdt_table (which is actually the GDT for booting CPUs in head.S). This patch fixes both problems by making GDT descriptors themselves allocated from a per_cpu area and switching to them in cpu_init(), which now means that cpu_gdt_table is exclusively used for booting CPUs again. Signed-off-by: NJames Bottomley <James.Bottomley@SteelEye.com> Cc: Zachary Amsden <zach@vmware.com> Cc: Matt Tolentino <metolent@snoqualmie.dp.intel.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 11 2月, 2006 1 次提交
-
-
由 Andrew Morton 提交于
Initialising cpu_possible_map to all-ones with CONFIG_HOTPLUG_CPU means that a) All for_each_cpu() loops will iterate across all NR_CPUS CPUs, rather than over possible ones. That can be quite expensive. b) Soon we'll be allocating per-cpu areas only for possible CPUs. So with CPU_MASK_ALL, we'll be wasting memory. I also switched voyager over to not use CPU_MASK_ALL in the non-CPU-hotplug case. Should be OK.. I note that parisc is also using CPU_MASK_ALL. Suggest that it stop doing that. Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Paul Jackson <pj@sgi.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Zwane Mwaikambo <zwane@linuxpower.ca> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 13 1月, 2006 2 次提交
-
-
由 akpm@osdl.org 提交于
) From: Al Viro <viro@ftp.linux.org.uk> task_pt_regs() needs the same offset-by-8 to match copy_thread() Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 akpm@osdl.org 提交于
) From: Ingo Molnar <mingo@elte.hu> This is the latest version of the scheduler cache-hot-auto-tune patch. The first problem was that detection time scaled with O(N^2), which is unacceptable on larger SMP and NUMA systems. To solve this: - I've added a 'domain distance' function, which is used to cache measurement results. Each distance is only measured once. This means that e.g. on NUMA distances of 0, 1 and 2 might be measured, on HT distances 0 and 1, and on SMP distance 0 is measured. The code walks the domain tree to determine the distance, so it automatically follows whatever hierarchy an architecture sets up. This cuts down on the boot time significantly and removes the O(N^2) limit. The only assumption is that migration costs can be expressed as a function of domain distance - this covers the overwhelming majority of existing systems, and is a good guess even for more assymetric systems. [ People hacking systems that have assymetries that break this assumption (e.g. different CPU speeds) should experiment a bit with the cpu_distance() function. Adding a ->migration_distance factor to the domain structure would be one possible solution - but lets first see the problem systems, if they exist at all. Lets not overdesign. ] Another problem was that only a single cache-size was used for measuring the cost of migration, and most architectures didnt set that variable up. Furthermore, a single cache-size does not fit NUMA hierarchies with L3 caches and does not fit HT setups, where different CPUs will often have different 'effective cache sizes'. To solve this problem: - Instead of relying on a single cache-size provided by the platform and sticking to it, the code now auto-detects the 'effective migration cost' between two measured CPUs, via iterating through a wide range of cachesizes. The code searches for the maximum migration cost, which occurs when the working set of the test-workload falls just below the 'effective cache size'. I.e. real-life optimized search is done for the maximum migration cost, between two real CPUs. This, amongst other things, has the positive effect hat if e.g. two CPUs share a L2/L3 cache, a different (and accurate) migration cost will be found than between two CPUs on the same system that dont share any caches. (The reliable measurement of migration costs is tricky - see the source for details.) Furthermore i've added various boot-time options to override/tune migration behavior. Firstly, there's a blanket override for autodetection: migration_cost=1000,2000,3000 will override the depth 0/1/2 values with 1msec/2msec/3msec values. Secondly, there's a global factor that can be used to increase (or decrease) the autodetected values: migration_factor=120 will increase the autodetected values by 20%. This option is useful to tune things in a workload-dependent way - e.g. if a workload is cache-insensitive then CPU utilization can be maximized by specifying migration_factor=0. I've tested the autodetection code quite extensively on x86, on 3 P3/Xeon/2MB, and the autodetected values look pretty good: Dual Celeron (128K L2 cache): --------------------- migration cost matrix (max_cache_size: 131072, cpu: 467 MHz): --------------------- [00] [01] [00]: - 1.7(1) [01]: 1.7(1) - --------------------- cacheflush times [2]: 0.0 (0) 1.7 (1784008) --------------------- Here the slow memory subsystem dominates system performance, and even though caches are small, the migration cost is 1.7 msecs. Dual HT P4 (512K L2 cache): --------------------- migration cost matrix (max_cache_size: 524288, cpu: 2379 MHz): --------------------- [00] [01] [02] [03] [00]: - 0.4(1) 0.0(0) 0.4(1) [01]: 0.4(1) - 0.4(1) 0.0(0) [02]: 0.0(0) 0.4(1) - 0.4(1) [03]: 0.4(1) 0.0(0) 0.4(1) - --------------------- cacheflush times [2]: 0.0 (33900) 0.4 (448514) --------------------- Here it can be seen that there is no migration cost between two HT siblings (CPU#0/2 and CPU#1/3 are separate physical CPUs). A fast memory system makes inter-physical-CPU migration pretty cheap: 0.4 msecs. 8-way P3/Xeon [2MB L2 cache]: --------------------- migration cost matrix (max_cache_size: 2097152, cpu: 700 MHz): --------------------- [00] [01] [02] [03] [04] [05] [06] [07] [00]: - 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) [01]: 19.2(1) - 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) [02]: 19.2(1) 19.2(1) - 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) [03]: 19.2(1) 19.2(1) 19.2(1) - 19.2(1) 19.2(1) 19.2(1) 19.2(1) [04]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) - 19.2(1) 19.2(1) 19.2(1) [05]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) - 19.2(1) 19.2(1) [06]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) - 19.2(1) [07]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) - --------------------- cacheflush times [2]: 0.0 (0) 19.2 (19281756) --------------------- This one has huge caches and a relatively slow memory subsystem - so the migration cost is 19 msecs. Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NAshok Raj <ashok.raj@intel.com> Signed-off-by: NKen Chen <kenneth.w.chen@intel.com> Cc: <wilder@us.ibm.com> Signed-off-by: NJohn Hawkes <hawkes@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 07 1月, 2006 1 次提交
-
-
由 Zachary Amsden 提交于
Make GDT page aligned and page padded to support running inside of a hypervisor. This prevents false sharing of the GDT page with other hot data, which is not allowed in Xen, and causes performance problems in VMware. Rather than go back to the old method of statically allocating the GDT (which wastes unneded space for non-present CPUs), the GDT for APs is allocated dynamically. Signed-off-by: NZachary Amsden <zach@vmware.com> Cc: "Seth, Rohit" <rohit.seth@intel.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 13 12月, 2005 1 次提交
-
-
由 Shaohua Li 提交于
Disabling LAPIC timer isn't sufficient. In some situations, such as we enabled NMI watchdog, there is still unexpected interrupt (such as NMI) invoked in offline CPU. This also avoids offline CPU receives spurious interrupt and anything similar. Signed-off-by: NShaohua Li <shaohua.li@intel.com> Signed-off-by: NAndi Kleen <ak@suse.de> Acked-by: N"Seth, Rohit" <rohit.seth@intel.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 15 11月, 2005 1 次提交
-
-
由 Siddha, Suresh B 提交于
Fields obtained through cpuid vector 0x1(ebx[16:23]) and vector 0x4(eax[14:25], eax[26:31]) indicate the maximum values and might not always be the same as what is available and what OS sees. So make sure "siblings" and "cpu cores" values in /proc/cpuinfo reflect the values as seen by OS instead of what cpuid instruction says. This will also fix the buggy BIOS cases (for example where cpuid on a single core cpu says there are "2" siblings, even when HT is disabled in the BIOS. http://bugzilla.kernel.org/show_bug.cgi?id=4359) Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: NAndi Kleen <ak@suse.de> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 09 11月, 2005 1 次提交
-
-
由 Nick Piggin 提交于
Run idle threads with preempt disabled. Also corrected a bugs in arm26's cpu_idle (make it actually call schedule()). How did it ever work before? Might fix the CPU hotplugging hang which Nigel Cunningham noted. We think the bug hits if the idle thread is preempted after checking need_resched() and before going to sleep, then the CPU offlined. After calling stop_machine_run, the CPU eventually returns from preemption and into the idle thread and goes to sleep. The CPU will continue executing previous idle and have no chance to call play_dead. By disabling preemption until we are ready to explicitly schedule, this bug is fixed and the idle threads generally become more robust. From: alexs <ashepard@u.washington.edu> PPC build fix From: Yoichi Yuasa <yuasa@hh.iij4u.or.jp> MIPS build fix Signed-off-by: NNick Piggin <npiggin@suse.de> Signed-off-by: NYoichi Yuasa <yuasa@hh.iij4u.or.jp> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 07 11月, 2005 1 次提交
-
-
由 Adrian Bunk 提交于
EXPORT_SYMBOL's for phys_proc_id and cpu_core_id were added this year but never used. Signed-off-by: NAdrian Bunk <bunk@stusta.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-