- 26 Sep, 2006 40 commits
-
-
Committed by Dmitriy Zavin
Refactor the event processing (syslog messaging and rate limiting) into a separate file, therm_throt.c. This allows consistent reporting of CPU thermal throttle events. After ACK'ing the interrupt, if the event is current, the user (p4.c/mce_intel.c) calls therm_throt_process to log (and rate limit) the event. If that function returns 1, the user has the option to log things further (such as to mce_log in x86_64). AK: minor cleanup Signed-off-by: Dmitriy Zavin <dmitriyz@google.com> Signed-off-by: Andi Kleen <ak@suse.de>
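The log-and-rate-limit contract described above can be sketched roughly as follows. This is only an illustration, not the code in therm_throt.c: the struct, the function name `throt_process`, the 300-second interval, and the message text are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical interval; the real value in therm_throt.c may differ. */
#define CHECK_INTERVAL_SECS 300

/* Hypothetical per-CPU state for the sketch. */
struct throt_state {
    unsigned long next_check; /* earliest time another message may be logged */
    unsigned long count;      /* total throttle events seen so far */
};

/*
 * Count the event, and log it unless a message was already logged within
 * the interval. Returns 1 if the event was logged (so the caller may log
 * it further), 0 otherwise.
 */
static int throt_process(struct throt_state *st, bool curr, unsigned long now_secs)
{
    if (curr)
        st->count++;
    if (now_secs < st->next_check)
        return 0; /* rate limited */
    st->next_check = now_secs + CHECK_INTERVAL_SECS;
    if (!curr)
        return 0;
    printf("CPU: temperature above threshold (total events = %lu)\n", st->count);
    return 1;
}
```

A caller in the style described by the commit would invoke this after acknowledging the interrupt and only forward the event to a secondary log when the return value is 1.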
-
Committed by Andi Kleen
Some buggy systems can machine check when config space accesses happen for some non-existent devices. i386/x86-64 do some early device scans that might trigger this. Allow pci=noearly to disable this. Also, when type 1 is disabled, don't do any early accesses, which are always type 1. This moves the pci= configuration parsing to be an early parameter. I don't think this can break anything because it only changes a single global that is only used by PCI. Cc: gregkh@suse.de Cc: Trammell Hudson <hudson@osresearch.net> Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Andi Kleen
This is useful on systems with a broken PCI bus. Affects various scans in x86-64 and i386's early ACPI quirk scan. Cc: gregkh@suse.de Cc: len.brown@intel.com Cc: Trammell Hudson <hudson@osresearch.net> Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Andi Kleen
SYSENTER can cause NT to be set, which might cause crashes on the IRET in the next task. Follows a similar i386 patch from Linus. Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Jan Beulich
Current gcc generates calls, not jumps, to noreturn functions. When that happens the return address can point to the next function, which confuses the unwinder. This patch works around it by marking asynchronous exception frames, in contrast to normal call frames, in the unwind information. Then teach the unwinder to decode this. For normal call frames the unwinder now subtracts one from the address, which avoids this problem. The standard libgcc unwinder uses the same trick. It doesn't include adjustment of the printed address (i.e. for the original example, it'd still be kernel_math_error+0 that gets displayed, but the unwinder wouldn't get confused anymore). This only works with binutils 2.6.17+ and some versions of H.J.Lu's 2.6.16, unfortunately, because earlier binutils don't support .cfi_signal_frame [AK: added automatic detection of the new binutils and wrote description] Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de>
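The subtract-one trick can be illustrated with a toy model. This is a sketch under assumptions, not the kernel unwinder: `struct frame`, `lookup_address`, and the address layout are hypothetical names and values chosen for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical frame record; a real unwinder keeps far more state. */
struct frame {
    uintptr_t pc;      /* return address (call frame) or interrupted pc (signal frame) */
    bool signal_frame; /* set when the unwind info marks the frame with .cfi_signal_frame */
};

/* Address to use when looking up unwind info for this frame. */
static uintptr_t lookup_address(const struct frame *f)
{
    /* A return address points just past the call, so step back into it. */
    return f->signal_frame ? f->pc : f->pc - 1;
}

/* Toy layout: the caller occupies [0x1000, 0x1010); the next function starts at 0x1010. */
static bool in_caller(uintptr_t addr)
{
    return addr >= 0x1000 && addr < 0x1010;
}
```

If the call to a noreturn function is the last instruction of the caller, the return address is 0x1010, which belongs to the next function; `lookup_address` yields 0x100f, which is still inside the caller. For a signal frame the pc is used unchanged, because it already names the interrupted instruction.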
-
Committed by Andi Kleen
This was old code that was needed for iBCS, and x86-64 never supported that. Pointed out by Albert Cahalan. Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Andi Kleen
We do some additional CPU synchronization in gettimeofday et al. to make sure the time stamps are always monotonic over multiple CPUs. But on single core systems that is not needed. So don't do it. Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Andi Kleen
It is faster than using an unrolled loop for the use cases the kernel cares about (cached, sizes typically < 4K). Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Matthew Garrett
Got it. i8259A_resume calls init_8259A(0) unconditionally, even if auto_eoi has been set. Keep track of the current status and restore that on resume. This fixes it for AMD64 and i386. Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org> Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Aaron Durbin
Patch inserts the GART region into the iomem resource map. The GART will then be visible within /proc/iomem. It will also allow other users of the GART (agp or IOMMU) to subreserve the region. Signed-off-by: Aaron Durbin <adurbin@google.com> Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Andi Kleen
Previously exit_idle would be called more often than enter_idle. Now, instead of using complicated tests, just keep track of it using the per CPU variable as a flip flop. I moved the idle state into the PDA to make the access more efficient. Original bug report and an initial patch from Stephane Eranian, but redone by AK. Cc: Stephane Eranian <eranian@hpl.hp.com> Signed-off-by: Andi Kleen <ak@suse.de>
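The flip-flop idea can be sketched like this. It is a simplified model, not the kernel code: the struct and its counters are hypothetical, and the real per-CPU flag lives in the PDA.

```c
/* Hypothetical per-CPU state; the counters stand in for idle notifier calls. */
struct cpu_idle_state {
    int isidle;       /* the flip flop: 1 between enter_idle and the matching exit */
    int start_events; /* number of "idle start" notifications sent */
    int end_events;   /* number of "idle end" notifications sent */
};

static void enter_idle(struct cpu_idle_state *c)
{
    c->isidle = 1;
    c->start_events++;
}

static void exit_idle(struct cpu_idle_state *c)
{
    if (!c->isidle)
        return; /* no matching enter_idle: do not send a second "end" event */
    c->isidle = 0;
    c->end_events++;
}
```

With the flag in place, calling `exit_idle` from several paths (for example on every interrupt entry) is harmless, because only the first call after an `enter_idle` has any effect.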
-
Committed by Andi Kleen
Before 2.6.16 this was changed to work around code that accessed CPUs not in the possible map. But that code should be all fixed now, so mark it __initdata again. Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Andi Kleen
- Don't zero for __copy_from_user_inatomic, following i386. This will prevent spurious zeros for parallel file system writers when one takes an exception - The string instruction version didn't zero the output on exception. Oops. Also I cleaned up the code a bit while I was at it and added a minor optimization to the string instruction path. Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by adurbin@google.com
This patch places the IOAPIC(s) and the Local APIC specified by ACPI tables into the resource map. The APICs will then be visible within /proc/iomem. Signed-off-by: Aaron Durbin <adurbin@google.com> Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Andi Kleen
Fix linux/arch/x86_64/kernel/process.c: In function __switch_to: linux/arch/x86_64/kernel/process.c:626: warning: assignment makes integer from pointer without a cast Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Andi Kleen
Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Arjan van de Ven
This patch adds the per thread cookie field to the task struct and the PDA. Also it makes sure that the PDA value gets the new cookie value at context switch, and that a new task gets a new cookie at task creation time. Signed-off-by: Arjan van Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andi Kleen <ak@suse.de> CC: Andi Kleen <ak@suse.de>
-
Committed by Andi Kleen
Because it can take spinlocks. Suggested by Mathieu Desnoyers. Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org> Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Magnus Damm
kexec: Avoid overwriting the current pgd (V4, x86_64) This patch upgrades the x86_64-specific kexec code to avoid overwriting the current pgd. Overwriting the current pgd is bad when CONFIG_CRASH_DUMP is used to start a secondary kernel that dumps the memory of the previous kernel. The code introduces a new set of page tables. These tables are used to provide an executable identity mapping without overwriting the current pgd. Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Keith Owens
Remove most of the special cases for the debug IST stack. This is a follow-on cleanup patch; it requires the bug fix patch that adds orig_ist. Signed-off-by: Keith Owens <kaos@ocs.com.au> Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Andi Kleen
Based on an idea by Jeremy Fitzhardinge: Replace the volatiles and memory clobbers in the PDA access with telling gcc about access to a proxy PDA structure that doesn't actually exist. But the dummy accesses give a defined ordering for read/write accesses. Also add some memory barriers to the early GS initialization to make sure no PDA access is moved before it. Advantage is some .text savings (probably most from better code for accessing "current"):
   text    data     bss     dec     hex filename
4845647 1223688  615864 6685199  66020f vmlinux
4837780 1223688  615864 6677332  65e354 vmlinux-pda
1.2% smaller code
Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Ian Campbell
This patch updates the x86_64 linker script to pack any .note.* sections into a PT_NOTE segment in the output file. To do this, we tell ld that we need a PT_NOTE segment. This requires us to start explicitly mapping sections to segments, so we also need to explicitly create PT_LOAD segments for text and data, and map the sections to them appropriately. Fortunately, each section will default to its previous section's segment, so it doesn't take many changes to vmlinux.lds.S. The corresponding change is already made for i386 in -mm and I'd like this patch to join it. The section to segment mappings do change, as do the segment flags, so some time in -mm would be good for that reason as well, just in case. In particular .data and .bss move from the text segment to the data segment, and .data.cacheline_aligned and .data.read_mostly are put in the data segment instead of a separate one. I think that it would be possible to exactly match the existing section to segment mapping and flags, but it would be a more intrusive change and I'm not sure there is a reason for the existing layout other than it is what you get by default if you don't explicitly specify something else. If there is a reason for the existing layout then I will of course make the more intrusive change. If there is no reason we could probably drop the executable or writable flags from some segments, but I don't know how much attention is paid to them anyway so it might not be worth the effort. The vsyscall related sections need to go in a different segment to the normal data segment and so I invented a "user" segment to contain them. I believe this should appear to be another data segment as far as the kernel is concerned so the flags are set up accordingly. The notes will be used in the Xen paravirt_ops backend to provide additional information to the domain builder. I am in the process of converting the xen-unstable kernels and tools over to this scheme at the moment to support this in the future.
It has been suggested to me that the notes segment should have flags 0 (i.e. not readable) since it is only used by the loader and is not used at runtime. For now I went with a readable segment since that is what the i386 patch uses. AK: dropped NOTES addition right now because the needed infrastructure for that is not merged yet Signed-off-by: Ian Campbell <ian.campbell@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Eric W. Biederman
In long mode the %cs is largely a relic. However there are a few cases like iret where it matters that we have a valid value. Without this patch it is possible to enter the kernel in startup_64 without setting %cs to a valid value. With this patch we don't care what %cs value we enter the kernel with, so long as the cs shadow register indicates it is a privileged code segment. Thanks to Magnus Damm for finding this problem and posting the first workable patch. I have moved the jump to set %cs down a few instructions so we don't need to take an extra jump, which keeps the code simpler. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Andi Kleen
Drop support for non e820 BIOS calls to get the memory map. The boot assembler code still has some support, but not the C code now. Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Andi Kleen
NMIs are not supposed to track the irq flags, but TRACE_IRQS_IRETQ did it anyway. Add a check. Cc: mingo@elte.hu Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Andi Kleen
Give the printks a consistent prefix. Add some missing white space. Cc: len.brown@intel.com Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Andi Kleen
- Remove a define that was used only once - Remove the too large APIC ID check because we always support the full 8bit range of APICs. - Restructure code a bit to be simpler. Cc: len.brown@intel.com Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Andi Kleen
ACPI went to great trouble to get the APIC version and CPU capabilities of different CPUs before passing them to the mpparser. But all that data was used for was printing it out. Actually it even faked some data based on the boot cpu, not on the actual CPU being booted. Remove all this code because it's not needed. Cc: len.brown@intel.com Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Andi Kleen
And replace all users with ordinary smp_processor_id. The function was originally added to get some basic oops information out even if the GS register was corrupted. However that didn't work for some anymore because printk is needed to print the oops and it uses smp_processor_id() already. Also GS register corruptions are not particularly common anymore. This also helps the Xen port which would otherwise need to do this in a special way because it can't access the local APIC. Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Rafael J. Wysocki
Detect the situations in which the time after a resume from disk would be earlier than the time before the suspend and prevent them from happening on x86_64. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Andi Kleen
x86-64 inherited code from i386 to force-reserve the 640k-1MB area. That was needed on some old systems. But we generally trust the e820 map to be correct on 64bit systems and to mark all areas that are not memory correctly. This patch will allow the real memory in there to be used. Or rather, the only way to find out if it's still needed is to try. So far I'm optimistic. Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Magnus Damm
The init_amd() function is only called from identify_cpu() which is already marked as __cpuinit. So let's mark it as __cpuinit. Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Andrew Morton
Implement pause_on_oops() on x86_64. AK: I redid the patch to do the oops_enter/exit in the existing oops_begin()/end(). This makes it much shorter. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Arjan van de Ven
Right now the kernel on x86-64 has a 100% lazy fpu behavior: after *every* context switch a trap is taken for the first FPU use to restore the FPU context lazily. This is of course great for applications that have very sporadic or no FPU use (since then you avoid doing the expensive save/restore all the time). However for very frequent FPU users... you take an extra trap every context switch. The patch below adds a simple heuristic to this code: After 5 consecutive context switches of FPU use, the lazy behavior is disabled and the context gets restored every context switch. If the app indeed uses the FPU, the trap is avoided. (The chance of the 6th time slice using FPU after the previous 5 having done so is quite high, obviously.) After 256 switches, this is reset and lazy behavior is returned (until there are 5 consecutive ones again). The reason for this is to give apps that do longer bursts of FPU use still the lazy behavior back after some time. [akpm@osdl.org: place new task_struct field next to jit_keyring to save space] Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org>
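The heuristic can be sketched as a small counter. This is a model under assumptions rather than the patch itself: the struct, the function names, and the use of an 8-bit counter that wraps to produce the reset after 256 switches are choices made for this illustration.

```c
#include <stdbool.h>

/* Threshold taken from the description above; the name is hypothetical. */
#define FPU_EAGER_THRESHOLD 5

/* Hypothetical per-task state: consecutive time slices that used the FPU. */
struct task {
    unsigned char fpu_counter; /* 8 bits, so it wraps back to 0 after 256 increments */
};

/* Called when a task is switched out, recording whether it used the FPU. */
static void switch_out(struct task *t, bool used_fpu)
{
    if (used_fpu)
        t->fpu_counter++;   /* the wrap to 0 gives the periodic return to lazy mode */
    else
        t->fpu_counter = 0; /* the streak of consecutive FPU use is broken */
}

/* Called when a task is switched in: true means restore the FPU state now. */
static bool restore_eagerly(const struct task *t)
{
    return t->fpu_counter >= FPU_EAGER_THRESHOLD;
}
```

In this model a task that uses the FPU in five slices in a row gets its state restored eagerly from the sixth slice on, and the wrap of the 8-bit counter periodically sends it back to lazy mode until it builds up a new streak.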
-
Committed by Eric W. Biederman
Now for a completely different but trivial approach. I just boot tested it with 255 CPUS and everything worked. Currently everything (except module data) we place in the per cpu area we know about at compile time. So instead of allocating a fixed size for the per_cpu area, allocate the number of bytes we need plus a fixed constant to be used for modules. It isn't perfect but it is much less of a pain to work with than what we are doing now. AK: fixed warning Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andi Kleen <ak@suse.de>
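The sizing rule amounts to simple arithmetic, sketched below. The reserve of 8192 bytes, the 64-byte alignment, and the function name are assumptions for illustration only; the patch defines its own constants.

```c
#include <stddef.h>

/* Hypothetical values chosen for the sketch. */
#define MODULE_RESERVE_BYTES 8192 /* fixed room kept for per-CPU data of modules */
#define CACHE_LINE_BYTES 64       /* alignment applied to each CPU's area */

/*
 * Size of one CPU's area: the statically known per-CPU data (known at
 * compile/link time) plus the fixed module reserve, rounded up to a
 * cache line boundary.
 */
static size_t percpu_area_size(size_t static_bytes)
{
    size_t size = static_bytes + MODULE_RESERVE_BYTES;
    return (size + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES * CACHE_LINE_BYTES;
}
```

Compared with a single fixed size, the area grows and shrinks with the amount of static per-CPU data, so the only remaining guess is how much room modules will need.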
-
Committed by Andi Kleen
Since it's all zero. Actually I think gcc 4+ will do that automatically, but earlier compilers won't. Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Dimitri Sivanich
I've noticed some erratic behavior while testing the X86_64 version of monotonic_clock(). While spinning in a loop reading monotonic clock values (pinned to a single cpu) I noticed that the difference between subsequent values occasionally went negative (time going backwards). I found that in the following code: this_offset = get_cycles_sync(); /* FIXME: 1000 or 1000000? */ --> offset = (this_offset - last_offset)*1000 / cpu_khz; } return base + offset; the offset sometimes turns out to be 0, even though this_offset > last_offset. +Added fix From: Toyo Abe <toyoa@mvista.com> The x86_64-mm-monotonic-clock.patch in 2.6.18-rc4-mm2 made a change to the updating of monotonic_base. It now uses cycles_2_ns(). I suggest that a set_cyc2ns_scale() should be done prior to the setup_irq(), because cycles_2_ns() can be called from the timer ISR right after the irq0 is enabled. Signed-off-by: Toyo Abe <toyoa@mvista.com> Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> Signed-off-by: Andi Kleen <ak@suse.de>
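One plausible way to see the zero offset is integer truncation: multiplying a cycle delta by 1000 and dividing by cpu_khz yields microseconds, so any delta shorter than one microsecond rounds down to 0. The sketch below contrasts that with a scaled fixed-point conversion to nanoseconds. The helper names and the 10-bit scale are assumptions for illustration and are not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical fixed-point shift for the cycles-to-nanoseconds scale. */
#define NS_SCALE 10

/* The quoted formula: result is in microseconds and truncates toward zero. */
static uint64_t offset_quoted(uint64_t delta_cycles, uint64_t cpu_khz)
{
    return delta_cycles * 1000 / cpu_khz;
}

/* Precompute nanoseconds per cycle as a fixed-point factor. */
static uint64_t cyc2ns_scale(uint64_t cpu_khz)
{
    return (UINT64_C(1000000) << NS_SCALE) / cpu_khz;
}

/* Convert cycles to nanoseconds using the precomputed factor. */
static uint64_t cycles_to_ns(uint64_t cycles, uint64_t scale)
{
    return (cycles * scale) >> NS_SCALE;
}
```

For a 2 GHz CPU (cpu_khz = 2000000) and a delta of 1500 cycles, the quoted formula gives 0 while the scaled version gives 750 ns. The scale must be computed before the first conversion, which matches the suggestion above to call set_cyc2ns_scale() before the timer interrupt is set up.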
-
Committed by Prasanna S.P
This patch moves the entry.S:error_entry to .kprobes.text section, since code marked unsafe for kprobes jumps directly to entry.S::error_entry, that must be marked unsafe as well. This patch also moves all the ".previous.text" asm directives to ".previous" for kprobes section. AK: Following a similar i386 patch from Chuck Ebbert AK: Also merged Jeremy's fix in. +From: Jeremy Fitzhardinge <jeremy@goop.org> KPROBE_ENTRY does a .section .kprobes.text, and expects its users to do a .previous at the end of the function. Unfortunately, if any code within the function switches sections, for example .fixup, then the .previous ends up putting all subsequent code into .fixup. Worse, any subsequent .fixup code gets intermingled with the code it's supposed to be fixing (which is also in .fixup). It's surprising this didn't cause more havoc. The fix is to use .pushsection/.popsection, so this stuff nests properly. A further cleanup would be to get rid of all .section/.previous pairs, since they're inherently fragile. +From: Chuck Ebbert <76306.1226@compuserve.com> Because code marked unsafe for kprobes jumps directly to entry.S::error_code, that must be marked unsafe as well. The easiest way to do that is to move the page fault entry point to just before error_code and let it inherit the same section. Also moved all the ".previous" asm directives for kprobes sections to column 1 and removed ".text" from them. Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Andi Kleen
Signed-off-by: Andi Kleen <ak@suse.de>
-
Committed by Andi Kleen
This unifies the standard backtracer and the new stacktrace in-memory backtracer. The standard one is converted to use callbacks, and stacktrace is then reimplemented using the new callbacks. The main advantage is that stacktrace can now use the new dwarf2 unwinder and avoid false positives in many cases. I kept it simple to make sure the standard backtracer stays reliable. Cc: mingo@elte.hu Signed-off-by: Andi Kleen <ak@suse.de>
-