- 16 3月, 2015 1 次提交
-
-
由 Len Brown 提交于
sched/idle/x86: Restore mwait_idle() to fix boot hangs, to improve power savings and to improve performance In Linux-3.9 we removed the mwait_idle() loop: 69fb3676 ("x86 idle: remove mwait_idle() and "idle=mwait" cmdline param") The reasoning was that modern machines should be sufficiently happy during the boot process using the default_idle() HALT loop, until cpuidle loads and either acpi_idle or intel_idle invoke the newer MWAIT-with-hints idle loop. But two machines reported problems: 1. Certain Core2-era machines support MWAIT-C1 and HALT only. MWAIT-C1 is preferred for optimal power and performance. But if they support just C1, cpuidle never loads and so they use the boot-time default idle loop forever. 2. Some laptops will boot-hang if HALT is used, but will boot successfully if MWAIT is used. This appears to be a hidden assumption in BIOS SMI, that is presumably valid on the proprietary OS where the BIOS was validated. https://bugzilla.kernel.org/show_bug.cgi?id=60770 So here we effectively revert the patch above, restoring the mwait_idle() loop. However, we don't bother restoring the idle=mwait cmdline parameter, since it appears to add no value. Maintainer notes: For 3.9, simply revert 69fb3676 for 3.10, patch -F3 applies, fuzz needed due to __cpuinit use in context For 3.11, 3.12, 3.13, this patch applies cleanly Tested-by: NMike Galbraith <bitbucket@online.de> Signed-off-by: NLen Brown <len.brown@intel.com> Acked-by: NMike Galbraith <bitbucket@online.de> Cc: <stable@vger.kernel.org> # 3.9+ Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ian Malone <ibmalone@gmail.com> Cc: Josh Boyer <jwboyer@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/345254a551eb5a6a866e048d7ab570fd2193aca4.1389763084.git.len.brown@intel.com [ Ported to recent kernels. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 03 9月, 2014 3 次提交
-
-
由 Oleg Nesterov 提交于
Cosmetic, but I think thread.fpu_counter should be initialized in arch_dup_task_struct() too, along with other "fpu" variables. And probably it make sense to turn it into thread.fpu->counter. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/20140902175730.GA21669@redhat.comReviewed-by: NSuresh Siddha <sbsiddha@gmail.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Oleg Nesterov 提交于
Cosmetic, but imho memset(&dst->thread.fpu, 0) is not good simply because it hides the (important) usage of ->has_fpu/etc from grep. Change this code to initialize the members explicitly. And note that ->last_cpu = 0 looks simply wrong, this can confuse fpu_lazy_restore() if per_cpu(fpu_owner_task, 0) has already exited and copy_process() re-allocated the same task_struct. Fortunately this is not actually possible because child->fpu_counter == 0 and thus fpu_lazy_restore() will not be called, but still this is not clean/robust. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/20140902175727.GA21666@redhat.comReviewed-by: NSuresh Siddha <sbsiddha@gmail.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Oleg Nesterov 提交于
arch_dup_task_struct() copies thread.fpu if fpu_allocated(), this looks suboptimal and misleading. Say, a forking process could use FPU only once in a signal handler but now tsk_used_math(src) == F, in this case the child gets a copy of fpu->state for no reason. The child won't use the saved registers anyway even if it starts to use FPU, this can only avoid fpu_alloc() in do_device_not_available(). Change this code to check tsk_used_math(current) instead. We still need to clear fpu->has_fpu/state, we could do this memset(0) under fpu_allocated() check but I think this doesn't make sense. See also the next change. use_eager_fpu() assumes that fpu_allocated() is always true, but a forking task (and thus its child) must always have PF_USED_MATH set, otherwise the child can either use FPU without used_math() (note that switch_fpu_prepare() doesn't do stts() in this case), or it will be killed by do_device_not_available()->BUG_ON(use_eager_fpu). Signed-off-by: NOleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/20140902175723.GA21659@redhat.comReviewed-by: NSuresh Siddha <sbsiddha@gmail.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 30 5月, 2014 1 次提交
-
-
由 Fenghua Yu 提交于
In standard form, each state is saved in the xsave area in fixed offset. But in compacted form, offset of each saved state only can be calculated during run time because some xstates may not be enabled and saved. We define kernel API get_xsave_addr() returns address of a given state saved in a xsave area. It can be called in kernel to get address of each xstate in xsave area in either standard format or compacted format. It's useful when kernel wants to directly access each state in xsave area. Signed-off-by: NFenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1401387164-43416-17-git-send-email-fenghua.yu@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 11 2月, 2014 1 次提交
-
-
由 Nicolas Pitre 提交于
The core idle loop now takes care of it. Signed-off-by: NNicolas Pitre <nico@linaro.org> Acked-by: NDaniel Lezcano <daniel.lezcano@linaro.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-sh@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: Russell King <linux@arm.linux.org.uk> Cc: linaro-kernel@lists.linaro.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-3ioazimg4j5iq6kdefks04i8@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 25 9月, 2013 1 次提交
-
-
由 Peter Zijlstra 提交于
Mike reported that commit 7d1a9417 ("x86: Use generic idle loop") regressed several workloads and caused excessive reschedule interrupts. The patch in question failed to notice that the x86 code had an inverted sense of the polling state versus the new generic code (x86: default polling, generic: default !polling). Fix the two prominent x86 mwait based idle drivers and introduce a few new generic polling helpers (fixing the wrong smp_mb__after_clear_bit usage). Also switch the idle routines to using tif_need_resched() which is an immediate TIF_NEED_RESCHED test as opposed to need_resched which will end up being slightly different. Reported-by: NMike Galbraith <bitbucket@online.de> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Cc: lenb@kernel.org Cc: tglx@linutronix.de Link: http://lkml.kernel.org/n/tip-nc03imb0etuefmzybzj7sprf@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 07 8月, 2013 1 次提交
-
-
由 Andi Kleen 提交于
Plus one function, load_gs_index(). Signed-off-by: NAndi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1375740170-7446-10-git-send-email-andi@firstfloor.orgSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 15 7月, 2013 1 次提交
-
-
由 Paul Gortmaker 提交于
The __cpuinit type of throwaway sections might have made sense some time ago when RAM was more constrained, but now the savings do not offset the cost and complications. For example, the fix in commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time") is a good example of the nasty type of bugs that can be created with improper use of the various __init prefixes. After a discussion on LKML[1] it was decided that cpuinit should go the way of devinit and be phased out. Once all the users are gone, we can then finally remove the macros themselves from linux/init.h. Note that some harmless section mismatch warnings may result, since notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c) are flagged as __cpuinit -- so if we remove the __cpuinit from arch specific callers, we will also get section mismatch warnings. As an intermediate step, we intend to turn the linux/init.h cpuinit content into no-ops as early as possible, since that will get rid of these warnings. In any case, they are temporary and harmless. This removes all the arch/x86 uses of the __cpuinit macros from all C files. x86 only had the one __CPUINIT used in assembly files, and it wasn't paired off with a .previous or a __FINIT, so we can delete it directly w/o any corresponding additional change there. [1] https://lkml.org/lkml/2013/5/20/589 Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Acked-by: NIngo Molnar <mingo@kernel.org> Acked-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NH. Peter Anvin <hpa@linux.intel.com> Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
- 12 6月, 2013 1 次提交
-
-
由 Thomas Gleixner 提交于
Moving x86 to the generic idle implementation (commit 7d1a9417 "x86: Use generic idle loop") wreckaged the stack protector. I stupidly missed that boot_init_stack_canary() must be inlined from a function which never returns, but I put that call into arch_cpu_idle_prepare() which of course returns. I pondered to play tricks with arch_cpu_idle_prepare() first, but then I noticed, that the other archs which have implemented the stackprotector (ARM and SH) do not initialize the canary for the non-boot cpus. So I decided to move the boot_init_stack_canary() call into cpu_startup_entry() ifdeffed with an CONFIG_X86 for now. This #ifdef is just a temporary measure as I don't want to inflict the boot_init_stack_canary() call on ARM and SH that late in the cycle. I'll queue a patch for 3.11 which removes the #ifdef if the ARM/SH maintainers have no objection. Reported-by: NWouter van Kesteren <woutershep@gmail.com> Cc: x86@kernel.org Cc: Russell King <linux@arm.linux.org.uk> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 07 5月, 2013 1 次提交
-
-
由 Thomas Gleixner 提交于
The core code expects the arch idle code to return with interrupts enabled. The conversion missed two x86 cases which fail to do that. Reported-and-tested-by: NMarkus Trippelsdorf <markus@trippelsdorf.de> Tested-by: NBorislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1305021557030.3972@ionosSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 01 5月, 2013 1 次提交
-
-
由 Tejun Heo 提交于
show_regs() is inherently arch-dependent but it does make sense to print generic debug information and some archs already do albeit in slightly different forms. This patch introduces a generic function to print debug information from show_regs() so that different archs print out the same information and it's much easier to modify what's printed. show_regs_print_info() prints out the same debug info as dump_stack() does plus task and thread_info pointers. * Archs which didn't print debug info now do. alpha, arc, blackfin, c6x, cris, frv, h8300, hexagon, ia64, m32r, metag, microblaze, mn10300, openrisc, parisc, score, sh64, sparc, um, xtensa * Already prints debug info. Replaced with show_regs_print_info(). The printed information is superset of what used to be there. arm, arm64, avr32, mips, powerpc, sh32, tile, unicore32, x86 * s390 is special in that it used to print arch-specific information along with generic debug info. Heiko and Martin think that the arch-specific extra isn't worth keeping s390 specfic implementation. Converted to use the generic version. Note that now all archs print the debug info before actual register dumps. An example BUG() dump follows. kernel BUG at /work/os/work/kernel/workqueue.c:4841! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #7 Hardware name: empty empty/S3992, BIOS 080011 10/26/2007 task: ffff88007c85e040 ti: ffff88007c860000 task.ti: ffff88007c860000 RIP: 0010:[<ffffffff8234a07e>] [<ffffffff8234a07e>] init_workqueues+0x4/0x6 RSP: 0000:ffff88007c861ec8 EFLAGS: 00010246 RAX: ffff88007c861fd8 RBX: ffffffff824466a8 RCX: 0000000000000001 RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffffffff8234a07a RBP: ffff88007c861ec8 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff8234a07a R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffff88015f7ff000 CR3: 00000000021f1000 CR4: 00000000000007f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffff88007c861ef8 ffffffff81000312 ffffffff824466a8 ffff88007c85e650 0000000000000003 0000000000000000 ffff88007c861f38 ffffffff82335e5d ffff88007c862080 ffffffff8223d8c0 ffff88007c862080 ffffffff81c47760 Call Trace: [<ffffffff81000312>] do_one_initcall+0x122/0x170 [<ffffffff82335e5d>] kernel_init_freeable+0x9b/0x1c8 [<ffffffff81c47760>] ? rest_init+0x140/0x140 [<ffffffff81c4776e>] kernel_init+0xe/0xf0 [<ffffffff81c6be9c>] ret_from_fork+0x7c/0xb0 [<ffffffff81c47760>] ? rest_init+0x140/0x140 ... v2: Typo fix in x86-32. v3: CPU number dropped from show_regs_print_info() as dump_stack_print_info() has been updated to print it. s390 specific implementation dropped as requested by s390 maintainers. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NJesper Nilsson <jesper.nilsson@axis.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Sam Ravnborg <sam@ravnborg.org> Acked-by: Chris Metcalf <cmetcalf@tilera.com> [tile bits] Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon bits] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 08 4月, 2013 1 次提交
-
-
由 Thomas Gleixner 提交于
Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Magnus Damm <magnus.damm@gmail.com> Link: http://lkml.kernel.org/r/20130321215235.486594473@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org
-
- 03 4月, 2013 1 次提交
-
-
由 Borislav Petkov 提交于
Convert AMD erratum 400 to the bug infrastructure. Then, retract all exports for modules since they're not needed now and make the AMD erratum checking machinery local to amd.c. Use forward declarations to avoid shuffling too much code around needlessly. Signed-off-by: NBorislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1363788448-31325-7-git-send-email-bp@alien8.deSigned-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 13 3月, 2013 1 次提交
-
-
由 Thomas Gleixner 提交于
Avoid going back into deep idle if the tick broadcast IPI is about to fire. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Arjan van de Veen <arjan@infradead.org> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/20130306111537.702278273@linutronix.de
-
- 18 2月, 2013 2 次提交
-
-
由 Len Brown 提交于
(pm_idle)() is being removed from linux/pm.h because Linux does not have such a cross-architecture concept. x86 uses an idle function pointer in its architecture specific code as a backup to cpuidle. So we re-name x86 use of pm_idle to x86_idle, and make it static to x86. Signed-off-by: NLen Brown <len.brown@intel.com> Cc: x86@kernel.org
-
由 Len Brown 提交于
Update APM to register its local idle routine with cpuidle. This allows us to stop exporting pm_idle to modules on x86. The Kconfig sub-option, APM_CPU_IDLE, now depends on on CPU_IDLE. Compile-tested only. Signed-off-by: NLen Brown <len.brown@intel.com> Reviewed-by: NDaniel Lezcano <daniel.lezcano@linaro.org> Cc: Jiri Kosina <jkosina@suse.cz>
-
- 10 2月, 2013 3 次提交
-
-
由 Len Brown 提交于
Remove 32-bit x86 a cmdline param "no-hlt", and the cpuinfo_x86.hlt_works_ok that it sets. If a user wants to avoid HLT, then "idle=poll" is much more useful, as it avoids invocation of HLT in idle, while "no-hlt" failed to do so. Indeed, hlt_works_ok was consulted in only 3 places. First, in /proc/cpuinfo where "hlt_bug yes" would be printed if and only if the user booted the system with "no-hlt" -- as there was no other code to set that flag. Second, check_hlt() would not invoke halt() if "no-hlt" were on the cmdline. Third, it was consulted in stop_this_cpu(), which is invoked by native_machine_halt()/reboot_interrupt()/smp_stop_nmi_callback() -- all cases where the machine is being shutdown/reset. The flag was not consulted in the more frequently invoked play_dead()/hlt_play_dead() used in processor offline and suspend. Since Linux-3.0 there has been a run-time notice upon "no-hlt" invocations indicating that it would be removed in 2012. Signed-off-by: NLen Brown <len.brown@intel.com> Cc: x86@kernel.org
-
由 Len Brown 提交于
mwait_idle() is a C1-only idle loop intended to be more efficient than HLT, starting on Pentium-4 HT-enabled processors. But mwait_idle() has been replaced by the more general mwait_idle_with_hints(), which handles both C1 and deeper C-states. ACPI processor_idle and intel_idle use only mwait_idle_with_hints(), and no longer use mwait_idle(). Here we simplify the x86 native idle code by removing mwait_idle(), and the "idle=mwait" bootparam used to invoke it. Since Linux 3.0 there has been a boot-time warning when "idle=mwait" was invoked saying it would be removed in 2012. This removal was also noted in the (now removed:-) feature-removal-schedule.txt. After this change, kernels configured with (CONFIG_ACPI=n && CONFIG_INTEL_IDLE=n) when run on hardware that supports MWAIT will simply use HLT. If MWAIT is desired on those systems, cpuidle and the cpuidle drivers above can be enabled. Signed-off-by: NLen Brown <len.brown@intel.com> Cc: x86@kernel.org
-
由 Len Brown 提交于
This macro is only invoked by Xen, so make its definition specific to Xen. > set_pm_idle_to_default() < xen_set_default_idle() Signed-off-by: NLen Brown <len.brown@intel.com> Cc: xen-devel@lists.xensource.com
-
- 26 1月, 2013 1 次提交
-
-
由 Paul Gortmaker 提交于
The text in Documentation said it would be removed in 2.6.41; the text in the Kconfig said removal in the 3.1 release. Either way you look at it, we are well past both, so push it off a cliff. Note that the POWER_CSTATE and the POWER_PSTATE are part of the legacy tracing API. Remove all tracepoints which use these flags. As can be seen from context, most already have a trace entry via trace_cpu_idle anyways. Also, the cpufreq/cpufreq.c PSTATE one is actually unpaired, as compared to the CSTATE ones which all have a clear start/stop. As part of this, the trace_power_frequency also becomes orphaned, so it too is deleted. Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 29 11月, 2012 1 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 26 10月, 2012 1 次提交
-
-
由 Daniel Lezcano 提交于
The hlt_use_halt function returns always true and there is only one definition of it. The default_idle function can then get ride of the if ... statement and we can remove the else branch. Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org> Cc: linaro-dev@lists.linaro.org Cc: patches@linaro.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1351181591-8710-1-git-send-email-daniel.lezcano@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 01 10月, 2012 2 次提交
-
-
由 Al Viro 提交于
32bit wrapper is lost on that; 64bit one is *not*, since we need to arrange for full pt_regs on stack when we call sys_execve() and we need to load callee-saved ones from there afterwards. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 20 9月, 2012 1 次提交
-
-
由 Al Viro 提交于
TIF_NOTIFY_RESUME will work in precisely the same way; all that is achieved by TIF_IRET is appearing that there's some work to be done, so we end up on the iret exit path. Just use NOTIFY_RESUME. And for execve() do that in 32bit start_thread(), not sys_execve() itself. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 19 9月, 2012 3 次提交
-
-
由 Suresh Siddha 提交于
Decouple non-lazy/eager fpu restore policy from the existence of the xsave feature. Introduce a synthetic CPUID flag to represent the eagerfpu policy. "eagerfpu=on" boot paramter will enable the policy. Requested-by: NH. Peter Anvin <hpa@zytor.com> Requested-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1347300665-6209-2-git-send-email-suresh.b.siddha@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Suresh Siddha 提交于
Fundamental model of the current Linux kernel is to lazily init and restore FPU instead of restoring the task state during context switch. This changes that fundamental lazy model to the non-lazy model for the processors supporting xsave feature. Reasons driving this model change are: i. Newer processors support optimized state save/restore using xsaveopt and xrstor by tracking the INIT state and MODIFIED state during context-switch. This is faster than modifying the cr0.TS bit which has serializing semantics. ii. Newer glibc versions use SSE for some of the optimized copy/clear routines. With certain workloads (like boot, kernel-compilation etc), application completes its work with in the first 5 task switches, thus taking upto 5 #DNA traps with the kernel not getting a chance to apply the above mentioned pre-load heuristic. iii. Some xstate features (like AMD's LWP feature) don't honor the cr0.TS bit and thus will not work correctly in the presence of lazy restore. Non-lazy state restore is needed for enabling such features. Some data on a two socket SNB system: * Saved 20K DNA exceptions during boot on a two socket SNB system. * Saved 50K DNA exceptions during kernel-compilation workload. * Improved throughput of the AVX based checksumming function inside the kernel by ~15% as xsave/xrstor is faster than the serializing clts/stts pair. Also now kernel_fpu_begin/end() relies on the patched alternative instructions. So move check_fpu() which uses the kernel_fpu_begin/end() after alternative_instructions(). Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1345842782-24175-7-git-send-email-suresh.b.siddha@intel.com Merge 32-bit boot fix from, Link: http://lkml.kernel.org/r/1347300665-6209-4-git-send-email-suresh.b.siddha@intel.com Cc: Jim Kukunas <james.t.kukunas@linux.intel.com> Cc: NeilBrown <neilb@suse.de> Cc: Avi Kivity <avi@redhat.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Suresh Siddha 提交于
Currently for x86 and x86_32 binaries, fpstate in the user sigframe is copied to/from the fpstate in the task struct. And in the case of signal delivery for x86_64 binaries, if the fpstate is live in the CPU registers, then the live state is copied directly to the user sigframe. Otherwise fpstate in the task struct is copied to the user sigframe. During restore, fpstate in the user sigframe is restored directly to the live CPU registers. Historically, different code paths led to different bugs. For example, x86_64 code path was not preemption safe till recently. Also there is lot of code duplication for support of new features like xsave etc. Unify signal handling code paths for x86 and x86_64 kernels. New strategy is as follows: Signal delivery: Both for 32/64-bit frames, align the core math frame area to 64bytes as needed by xsave (this where the main fpu/extended state gets copied to and excludes the legacy compatibility fsave header for the 32-bit [f]xsave frames). If the state is live, copy the register state directly to the user frame. If not live, copy the state in the thread struct to the user frame. And for 32-bit [f]xsave frames, construct the fsave header separately before the actual [f]xsave area. Signal return: As the 32-bit frames with [f]xstate has an additional 'fsave' header, copy everything back from the user sigframe to the fpstate in the task structure and reconstruct the fxstate from the 'fsave' header (Also user passed pointers may not be correctly aligned for any attempt to directly restore any partial state). At the next fpstate usage, everything will be restored to the live CPU registers. For all the 64-bit frames and the 32-bit fsave frame, restore the state from the user sigframe directly to the live CPU registers. 64-bit signals always restored the math frame directly, so we can expect the math frame pointer to be correctly aligned. For 32-bit fsave frames, there are no alignment requirements, so we can restore the state directly. "lat_sig catch" microbenchmark numbers (for x86, x86_64, x86_32 binaries) are with in the noise range with this change. Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1343171129-2747-4-git-send-email-suresh.b.siddha@intel.com [ Merged in compilation fix ] Link: http://lkml.kernel.org/r/1344544736.8326.17.camel@sbsiddha-desk.sc.intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 06 6月, 2012 1 次提交
-
-
由 Joe Perches 提交于
Use a more current logging style: - Bare printks should have a KERN_<LEVEL> for consistency's sake - Add pr_fmt where appropriate - Neaten some macro definitions - Convert some Ok output to OK - Use "%s: ", __func__ in pr_fmt for summit - Convert some printks to pr_<level> Message output is not identical in all cases. Signed-off-by: NJoe Perches <joe@perches.com> Cc: levinsasha928@gmail.com Link: http://lkml.kernel.org/r/1337655007.24226.10.camel@joe2Laptop [ merged two similar patches, tidied up the changelog ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 17 5月, 2012 2 次提交
-
-
由 Suresh Siddha 提交于
There is no need to save any active fpu state to the task structure memory if the task is dead. Just drop the state instead. For example, this saved some 1770 xsave's during the system boot of a two socket Xeon system. Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1336692811-30576-4-git-send-email-suresh.b.siddha@intel.com Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Suresh Siddha 提交于
Historical prepare_to_copy() is mostly a no-op, duplicated for majority of the architectures and the rest following the x86 model of flushing the extended register state like fpu there. Remove it and use the arch_dup_task_struct() instead. Suggested-by: NOleg Nesterov <oleg@redhat.com> Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1336692811-30576-1-git-send-email-suresh.b.siddha@intel.comAcked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Chris Zankel <chris@zankel.net> Cc: Richard Henderson <rth@twiddle.net> Cc: Russell King <linux@arm.linux.org.uk> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Mark Salter <msalter@redhat.com> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Chen Liqin <liqin.chen@sunplusct.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 15 5月, 2012 1 次提交
-
-
由 Alex Shi 提交于
Since percpu_xxx() serial functions are duplicated with this_cpu_xxx(). Removing percpu_xxx() definition and replacing them by this_cpu_xxx() in code. There is no function change in this patch, just preparation for later percpu_xxx serial function removing. On x86 machine the this_cpu_xxx() serial functions are same as __this_cpu_xxx() without no unnecessary premmpt enable/disable. Thanks for Stephen Rothwell, he found and fixed a i386 build error in the patch. Also thanks for Andrew Morton, he kept updating the patchset in Linus' tree. Signed-off-by: NAlex Shi <alex.shi@intel.com> Acked-by: NChristoph Lameter <cl@gentwo.org> Acked-by: NTejun Heo <tj@kernel.org> Acked-by: N"H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 09 5月, 2012 1 次提交
-
-
由 Jan Beulich 提交于
What was called show_registers() so far already showed a stack trace for kernel faults, and kernel_stack_pointer() isn't even valid to be used for faults from user mode, hence it was pointless for show_regs() to call show_trace() after show_registers(). Simply rename show_registers() to show_regs() and eliminate the old definition. Signed-off-by: NJan Beulich <jbeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/4FAA3D3902000078000826E1@nat28.tlf.novell.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 08 5月, 2012 2 次提交
-
-
由 Thomas Gleixner 提交于
The only difference is the free_thread_info function, which frees xstate. Use the new arch_release_task_struct() function instead and switch over to the core allocator. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20120505150141.559556763@linutronix.de Cc: x86@kernel.org
-
由 Thomas Gleixner 提交于
Use kick_all_cpus_sync() and remove cpu_idle_wait(). Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20120507175652.190382227@linutronix.de Cc: x86@kernel.org
-
- 07 5月, 2012 1 次提交
-
-
由 Srivatsa S. Bhat 提交于
The checks that exist in mwait_usable() for "idle=" kernel parameters are insufficient. As a result, mwait_usable() can return 1 even if "idle=nomwait" or "idle=poll" or "idle=halt" parameters are passed. Of these cases, incorrect handling of idle=nomwait is a universal problem since mwait can get used for usual CPU idling. However the rest of the cases are problematic only during CPU Hotplug (offline) because, in the CPU offline path, the function mwait_play_dead() is called, which might result in mwait being used in the offline CPUs, if mwait_usable() happens to return 1. Fix these issues by checking for the boot time "idle=" kernel parameter properly in mwait_usable(). The first issue (usual cpu idling) is demonstrated below: Before applying the patch (dmesg snippet): [ 0.000000] Command line: [...] idle=nomwait [ 0.000000] Kernel command line: [...] idle=nomwait [ 0.000000] RCU dyntick-idle grace-period acceleration is enabled. [ 0.140606] using mwait in idle threads. <======= mwait being used [ 4.303986] cpuidle: using governor ladder [ 4.308232] cpuidle: using governor menu After applying the patch: [ 0.000000] Command line: [...] idle=nomwait [ 0.000000] Kernel command line: [...] idle=nomwait [ 0.000000] RCU dyntick-idle grace-period acceleration is enabled. [ 4.264100] cpuidle: using governor ladder [ 4.268342] cpuidle: using governor menu Signed-off-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: NDeepthi Dharwar <deepthi@linux.vnet.ibm.com> Acked-by: NThomas Gleixner <tglx@linutronix.de> Cc: venki@google.com Cc: suresh.b.siddha@intel.com Cc: Borislav Petkov <bp@amd64.org> Cc: lenb@kernel.org Cc: Rafael J. Wysocki <rjw@sisk.pl> Link: http://lkml.kernel.org/r/4F9E37B8.30001@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 05 5月, 2012 1 次提交
-
-
由 Thomas Gleixner 提交于
Same code. Use the generic version. The special Makefile treatment is pointless anyway as init_task.o contains only data which is handled by the linker script. So no point on being treated like head text. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20120503085035.739963562@linutronix.de Cc: x86@kernel.org
-
- 30 3月, 2012 1 次提交
-
-
由 Len Brown 提交于
The X86_32-only disable_hlt/enable_hlt mechanism was used by the 32-bit floppy driver. Its effect was to replace the use of the HLT instruction inside default_idle() with cpu_relax() - essentially it turned off the use of HLT. This workaround was commented in the code as: "disable hlt during certain critical i/o operations" "This halt magic was a workaround for ancient floppy DMA wreckage. It should be safe to remove." H. Peter Anvin additionally adds: "To the best of my knowledge, no-hlt only existed because of flaky power distributions on 386/486 systems which were sold to run DOS. Since DOS did no power management of any kind, including HLT, the power draw was fairly uniform; when exposed to the much hhigher noise levels you got when Linux used HLT caused some of these systems to fail. They were by far in the minority even back then." Alan Cox further says: "Also for the Cyrix 5510 which tended to go castors up if a HLT occurred during a DMA cycle and on a few other boxes HLT during DMA tended to go astray. Do we care ? I doubt it. The 5510 was pretty obscure, the 5520 fixed it, the 5530 is probably the oldest still in any kind of use." So, let's finally drop this. Signed-off-by: NLen Brown <len.brown@intel.com> Signed-off-by: NJosh Boyer <jwboyer@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: N"H. Peter Anvin" <hpa@zytor.com> Acked-by: NAlan Cox <alan@lxorguk.ukuu.org.uk> Cc: Stephen Hemminger <shemminger@vyatta.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/n/tip-3rhk9bzf0x9rljkv488tloib@git.kernel.org [ If anyone cares then alternative instruction patching could be used to replace HLT with a one-byte NOP instruction. Much simpler. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 29 3月, 2012 1 次提交
-
-
由 David Howells 提交于
Disintegrate asm/system.h for X86. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NH. Peter Anvin <hpa@zytor.com> cc: x86@kernel.org
-