- 05 5月, 2010 12 次提交
-
-
由 Eric W. Biederman 提交于
ACPI irq source overrides are allowed for the 16 isa irqs and are allowed to map any gsi to any isa irq. A few motherboards have been seen to take advantage of this and put the isa irqs on the 2nd or 3rd ioapic. This causes some problems, most notably the fact that we can not use any gsi < 16. To correct this move the gsis that are not isa irqs and have a gsi number < 16 into the linux irq space just past gsi_end. This is what the es7000 platform is doing today. Moving only the low 16 gsis above the rest of the gsi's only penalizes weird platforms, leaving sane acpi implementations with a 1-1 mapping of gsis and irqs. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> LKML-Reference: <1269936436-7039-14-git-send-email-ebiederm@xmission.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Eric W. Biederman 提交于
Use the global gsi_end value now that all ioapics have valid gsi numbers instead of a combination of acpi_probe_gsi and walking all of the ioapics and couting their number of entries by hand if acpi_probe_gsi gave us an answer we did not like. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> LKML-Reference: <1269936436-7039-13-git-send-email-ebiederm@xmission.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Eric W. Biederman 提交于
Now that all ioapics have valid gsi_base values use this to accellerate pin_2_irq. In the case of acpi this also ensures that pin_2_irq will compute the same irq value for an ioapic pin as acpi will. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> LKML-Reference: <1269936436-7039-12-git-send-email-ebiederm@xmission.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Eric W. Biederman 提交于
Now that all ioapic registration happens in mp_register_ioapic we can move the calculation of nr_ioapic_registers there from enable_IO_APIC. The number of ioapic registers is already calucated in mp_register_ioapic so all that really needs to be done is to save the caluclated value in nr_ioapic_registers. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> LKML-Reference: <1269936436-7039-11-git-send-email-ebiederm@xmission.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Eric W. Biederman 提交于
Long ago MP_ioapic_info was the primary way of setting up our ioapic data structures and mp_register_ioapic was a compatibility shim for acpi code. Now the situation is reversed and and mp_register_ioapic is the primary way of setting up our ioapic data structures. Keep the setting up of ioapic data structures uniform by having mp_register_ioapic call mp_register_ioapic. This changes a few fields: - type: is now hardset to MP_IOAPIC but type had to bey MP_IOAPIC or MP_ioapic_info would not have been called. - flags: is now hard coded to MPC_APIC_USABLE. We require flags to contain at least MPC_APIC_USEBLE in MP_ioapic_info and we don't ever examine flags so dropping a few flags that might possibly exist that we have never used is harmless. - apicaddr: Unchanged - apicver: Read from the ioapic instead of using the cached hardware value in the MP table. The real hardware value will be more accurate. - apicid: Now verified to be unique and changed if it is not. If the BIOS got this right this is a noop. If the BIOS did not fixing things appears to be the better solution. This adds gsi_base and gsi_end values to our ioapics defined with the mpatable, which will make our lives simpler later since we can always assume gsi_base and gsi_end are valid. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> LKML-Reference: <1269936436-7039-10-git-send-email-ebiederm@xmission.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Eric W. Biederman 提交于
Add the global variable gsi_end and teach mp_register_ioapic to keep it uptodate as we add more ioapics into the system. ioapics can only be added early in boot so the code that runs later can treat gsi_end as a constant. Remove the have hacks in sfi.c to second guess mp_register_ioapic by keeping t's own running total of how many gsi's have been seen, and instead use the gsi_end. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> LKML-Reference: <1269936436-7039-9-git-send-email-ebiederm@xmission.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Eric W. Biederman 提交于
This patches fixes the types of gsi_base and gsi_end values in struct mp_ioapic_gsi, and the gsi parameter of mp_find_ioapic and mp_find_ioapic_pin A gsi is cannonically a u32, not an int. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> LKML-Reference: <1269936436-7039-8-git-send-email-ebiederm@xmission.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Eric W. Biederman 提交于
io_apic_redir_entries has a huge conceptual bug. It returns the maximum redirection entry not the number of redirection entries. Which simply does not match what the name of the function. This just caught me and it caught Feng Tang, and Len Brown when they wrote sfi_parse_ioapic. Modify io_apic_redir_entries to actually return the number of redirection entries, and fix the callers so that they properly handle receiving the number of the number of redirection table entries, instead of the number of redirection table entries less one. While the usage in sfi.c does not show up in this patch it is fixed by virtue of the fact that io_apic_redir_entries now has the semantics sfi_parse_ioapic most reasonably expects. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> LKML-Reference: <1269936436-7039-7-git-send-email-ebiederm@xmission.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Eric W. Biederman 提交于
Remove the assumption that there is not an override for isa irq 0. Instead lookup the gsi and from that lookup the ioapic and pin of each isa irq indivdually. In general this should not have any behavioural affect but in perverse cases this gets all of the details correct, instead of doing something weird. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> LKML-Reference: <1269936436-7039-5-git-send-email-ebiederm@xmission.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Eric W. Biederman 提交于
Currently acpi_sci_ioapic_setup calls mp_override_legacy_irq with bus_irq == gsi, which is wrong if we are comming from an override Instead pass the bus_irq into acpi_sci_ioapic_setup. This fix was inspired by a similar fix from: Yinghai Lu <yinghai@kernel.org> Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> LKML-Reference: <1269936436-7039-4-git-send-email-ebiederm@xmission.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Eric W. Biederman 提交于
In perverse acpi implementations the isa irqs are not identity mapped to the first 16 gsi. Furthermore at least the extended interrupt resource capability may return gsi's and not isa irqs. So since what we get from acpi is a gsi teach acpi_get_overrride_irq to operate on a gsi instead of an isa_irq. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> LKML-Reference: <1269936436-7039-2-git-send-email-ebiederm@xmission.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Eric W. Biederman 提交于
There are a number of cases where the current code makes the assumption that isa irqs identity map to the first 16 acpi global system intereupts. In most instances that assumption is correct as that is the required behaviour in dual i8259 mode and the default behavior in ioapic mode. However there are some systems out there that take advantage of acpis interrupt remapping for the isa irqs to have a completely different mapping of isa_irq to gsi. Introduce acpi_isa_irq_to_gsi to perform this mapping explicitly in the code that needs it. Initially this will be just the current assumed identity mapping to ensure it's introduction does not cause regressions. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> LKML-Reference: <1269936436-7039-1-git-send-email-ebiederm@xmission.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 25 4月, 2010 1 次提交
-
-
由 Dmitry Torokhov 提交于
This is a standalone version of VMware Balloon driver. Ballooning is a technique that allows hypervisor dynamically limit the amount of memory available to the guest (with guest cooperation). In the overcommit scenario, when hypervisor set detects that it needs to shuffle some memory, it instructs the driver to allocate certain number of pages, and the underlying memory gets returned to the hypervisor. Later hypervisor may return memory to the guest by reattaching memory to the pageframes and instructing the driver to "deflate" balloon. We are submitting a standalone driver because KVM maintainer (Avi Kivity) expressed opinion (rightly) that our transport does not fit well into virtqueue paradigm and thus it does not make much sense to integrate with virtio. There were also some concerns whether current ballooning technique is the right thing. If there appears a better framework to achieve this we are prepared to evaluate and switch to using it, but in the meantime we'd like to get this driver upstream. We want to get the driver accepted in distributions so that users do not have to deal with an out-of-tree module and many distributions have "upstream first" requirement. The driver has been shipping for a number of years and users running on VMware platform will have it installed as part of VMware Tools even if it will not come from a distribution, thus there should not be additional risk in pulling the driver into mainline. The driver will only activate if host is VMware so everyone else should not be affected at all. Signed-off-by: NDmitry Torokhov <dtor@vmware.com> Cc: Avi Kivity <avi@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 24 4月, 2010 2 次提交
-
-
由 H. Peter Anvin 提交于
Atom erratum AAE44/AAF40/AAG38/AAH41: "If software clears the PS (page size) bit in a present PDE (page directory entry), that will cause linear addresses mapped through this PDE to use 4-KByte pages instead of using a large page after old TLB entries are invalidated. Due to this erratum, if a code fetch uses this PDE before the TLB entry for the large page is invalidated then it may fetch from a different physical address than specified by either the old large page translation or the new 4-KByte page translation. This erratum may also cause speculative code fetches from incorrect addresses." [http://download.intel.com/design/processor/specupdt/319536.pdf] Where as commit 211b3d03 seems to workaround errata AAH41 (mixed 4K TLBs) it reduces the window of opportunity for the bug to occur and does not totally remove it. This patch disables mixed 4K/4MB page tables totally avoiding the page splitting and not tripping this processor issue. This is based on an original patch by Colin King. Originally-by: NColin Ian King <colin.king@canonical.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com> LKML-Reference: <1269271251-19775-1-git-send-email-colin.king@canonical.com> Cc: <stable@kernel.org>
-
由 H. Peter Anvin 提交于
When we do a thread switch, we clear the outgoing FS/GS base if the corresponding selector is nonzero. This is taken by __switch_to() as an entry invariant; it does not verify that it is true on entry. However, copy_thread() doesn't enforce this constraint, which can result in inconsistent results after fork(). Make copy_thread() match the behavior of __switch_to(). Reported-and-tested-by: NSamuel Thibault <samuel.thibault@inria.fr> Signed-off-by: NH. Peter Anvin <hpa@zytor.com> LKML-Reference: <4BD1E061.8030605@zytor.com> Cc: <stable@kernel.org>
-
- 21 4月, 2010 1 次提交
-
-
由 Jacob Pan 提交于
APB timer is used on Moorestown platforms but not on a standard PC. If APB timer code is compiled in but not initialized at run-time due to lack of FW reported SFI table, kernel would panic when the non-boot CPUs are offlined and notifier is called. https://bugzilla.kernel.org/show_bug.cgi?id=15786 This patch ensures CPU hotplug notifier for APB timer is only registered when the APBT timer block is initialized. Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com> LKML-Reference: <1271701423-1162-1-git-send-email-jacob.jun.pan@linux.intel.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 09 4月, 2010 1 次提交
-
-
由 Frederic Weisbecker 提交于
When we fetch the hot regs and rewind to the nth caller, it might happen that we dereference a frame pointer outside the kernel stack boundaries, like in this example: perf_trace_sched_switch+0xd5/0x120 schedule+0x6b5/0x860 retint_careful+0xd/0x21 Since we directly dereference a userspace frame pointer here while rewinding behind retint_careful, this may end up in a crash. Fix this by simply using probe_kernel_address() when we rewind the frame pointer. This issue will have a much more proper fix in the next version of the perf_arch_fetch_caller_regs() API that will only need to rewind to the first caller. Reported-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Tested-by: NEric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: David Miller <davem@davemloft.net> Cc: Archs <linux-arch@vger.kernel.org>
-
- 07 4月, 2010 5 次提交
-
-
由 Joerg Roedel 提交于
If we boot into a crash-kernel the gart might still be enabled and its caches might be dirty. This can result in undefined behavior later. Fix it by explicitly disabling the gart hardware before initialization and flushing the caches after enablement. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
-
由 Chris Wright 提交于
Replace open coded version with for_each_pci_dev Signed-off-by: NChris Wright <chrisw@sous-sol.org> Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
-
由 Chris Wright 提交于
This effectively reverts commit 61d047be. Disabling the IOMMU can potetially allow DMA transactions to complete without being translated. Leave it enabled, and allow crash kernel to do the IOMMU reinitialization properly. Cc: stable@kernel.org Cc: Joerg Roedel <joerg.roedel@amd.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: NChris Wright <chrisw@sous-sol.org> Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
-
由 Chris Wright 提交于
To catch future potential issues we can add a warning whenever we issue a command before the command buffer is fully initialized. Signed-off-by: NChris Wright <chrisw@sous-sol.org> Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
-
由 Chris Wright 提交于
Hit another kdump problem as reported by Neil Horman. When initializaing the IOMMU, we attach devices to their domains before the IOMMU is fully (re)initialized. Attaching a device will issue some important invalidations. In the context of the newly kexec'd kdump kernel, the IOMMU may have stale cached data from the original kernel. Because we do the attach too early, the invalidation commands are placed in the new command buffer before the IOMMU is updated w/ that buffer. This leaves the stale entries in the kdump context and can renders device unusable. Simply enable the IOMMU before we do the attach. Cc: stable@kernel.org Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: NChris Wright <chrisw@sous-sol.org> Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
-
- 06 4月, 2010 1 次提交
-
-
由 Vince Weaver 提交于
According to Intel Software Devel Manual Volume 3B, the Nehalem-EX PMU is just like regular Nehalem (except for the uncore support, which is completely different). Signed-off-by: NVince Weaver <vweaver1@eecs.utk.edu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Lin Ming <ming.m.lin@intel.com> LKML-Reference: <alpine.DEB.2.00.1004060956580.1417@cl320.eecs.utk.edu> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 03 4月, 2010 4 次提交
-
-
由 Suresh Siddha 提交于
Jan Grossmann reported kernel boot panic while booting SMP kernel on his system with a single core cpu. SMP kernels call enable_IR_x2apic() from native_smp_prepare_cpus() and on platforms where the kernel doesn't find SMP configuration we ended up again calling enable_IR_x2apic() from the APIC_init_uniprocessor() call in the smp_sanity_check(). Thus leading to kernel panic. Don't call enable_IR_x2apic() and default_setup_apic_routing() from APIC_init_uniprocessor() in CONFIG_SMP case. NOTE: this kind of non-idempotent and assymetric initialization sequence is rather fragile and unclean, we'll clean that up in v2.6.35. This is the minimal fix for v2.6.34. Reported-by: Jan.Grossmann@kielnet.net Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Cc: <jbarnes@virtuousgeek.org> Cc: <david.woodhouse@intel.com> Cc: <weidong.han@intel.com> Cc: <youquan.song@intel.com> Cc: <Jan.Grossmann@kielnet.net> Cc: <stable@kernel.org> # [v2.6.32.x, v2.6.33.x] LKML-Reference: <1270083887.7835.78.camel@sbs-t61.sc.intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Torok Edwin 提交于
When profiling a 32-bit process on a 64-bit kernel, callgraph tracing stopped after the first function, because it has seen a garbage memory address (tried to interpret the frame pointer, and return address as a 64-bit pointer). Fix this by using a struct stack_frame with 32-bit pointers when the TIF_IA32 flag is set. Note that TIF_IA32 flag must be used, and not is_compat_task(), because the latter is only set when the 32-bit process is executing a syscall, which may not always be the case (when tracing page fault events for example). Signed-off-by: NTörök Edwin <edwintorok@gmail.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paul Mackerras <paulus@samba.org> Cc: x86@kernel.org Cc: linux-kernel@vger.kernel.org LKML-Reference: <1268820436-13145-1-git-send-email-edwintorok@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Commit 3f6da390 ("perf: Rework and fix the arch CPU-hotplug hooks") moved the amd northbridge allocation from CPUS_ONLINE to CPUS_PREPARE_UP however amd_nb_id() doesn't work yet on prepare so it would simply bail basically reverting to a state where we do not properly track node wide constraints - causing weird perf results. Fix up the AMD NorthBridge initialization code by allocating from CPU_UP_PREPARE and installing it from CPU_STARTING once we have the proper nb_id. It also properly deals with the allocation failing. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> [ robustify using amd_has_nb() ] Signed-off-by: NStephane Eranian <eranian@google.com> LKML-Reference: <1269353485.5109.48.camel@twins> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Because we need to have cpu identification things done by the time we run CPU_STARTING notifiers. ( This init ordering will be relied on by the next fix. ) Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1269353485.5109.48.camel@twins> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 02 4月, 2010 4 次提交
-
-
由 Yinghai Lu 提交于
This allows arch code could decide the way to reserve the ibft. And we should reserve ibft as early as possible, instead of BOOTMEM stage, in case the table is in RAM range and is not reserved by BIOS (this will often be the case.) Move to just after find_smp_config(). Also when CONFIG_NO_BOOTMEM=y, We will not have reserve_bootmem() anymore. -v2: fix typo about ibft pointed by Konrad Rzeszutek Wilk <konrad@darnok.org> Signed-off-by: NYinghai Lu <yinghai@kernel.org> LKML-Reference: <4BB510FB.80601@kernel.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Peter Jones <pjones@redhat.com> Cc: Konrad Rzeszutek Wilk <konrad@kernel.org> CC: Jan Beulich <jbeulich@novell.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Alok Kataria 提交于
We think there exists a bug in the HPET code that emulates the RTC. In the normal case, when the RTC frequency is set, the rtc driver tells the hpet code about it here: int hpet_set_periodic_freq(unsigned long freq) { uint64_t clc; if (!is_hpet_enabled()) return 0; if (freq <= DEFAULT_RTC_INT_FREQ) hpet_pie_limit = DEFAULT_RTC_INT_FREQ / freq; else { clc = (uint64_t) hpet_clockevent.mult * NSEC_PER_SEC; do_div(clc, freq); clc >>= hpet_clockevent.shift; hpet_pie_delta = (unsigned long) clc; } return 1; } If freq is set to 64Hz (DEFAULT_RTC_INT_FREQ) or lower, then hpet_pie_limit (a static) is set to non-zero. Then, on every one-shot HPET interrupt, hpet_rtc_timer_reinit is called to compute the next timeout. Well, that function has this logic: if (!(hpet_rtc_flags & RTC_PIE) || hpet_pie_limit) delta = hpet_default_delta; else delta = hpet_pie_delta; Since hpet_pie_limit is not 0, hpet_default_delta is used. That corresponds to 64Hz. Now, if you set a different rtc frequency, you'll take the else path through hpet_set_periodic_freq, but unfortunately no one resets hpet_pie_limit back to 0. Boom....now you are stuck with 64Hz RTC interrupts forever. The patch below just resets the hpet_pie_limit value when requested freq is greater than DEFAULT_RTC_INT_FREQ, which we think fixes this problem. Signed-off-by: NAlok N Kataria <akataria@vmware.com> LKML-Reference: <201003112200.o2BM0Hre012875@imap1.linux-foundation.org> Signed-off-by: NDaniel Hecht <dhecht@vmware.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Pallipadi, Venkatesh 提交于
On Wed, Feb 24, 2010 at 03:37:04PM -0800, Justin Piszcz wrote: > Hello, > > Again, on the Intel DP55KG board: > > # uname -a > Linux host 2.6.33 #1 SMP Wed Feb 24 18:31:00 EST 2010 x86_64 GNU/Linux > > [ 1.237600] ------------[ cut here ]------------ > [ 1.237890] WARNING: at arch/x86/kernel/hpet.c:404 hpet_next_event+0x70/0x80() > [ 1.238221] Hardware name: > [ 1.238504] hpet: compare register read back failed. > [ 1.238793] Modules linked in: > [ 1.239315] Pid: 0, comm: swapper Not tainted 2.6.33 #1 > [ 1.239605] Call Trace: > [ 1.239886] <IRQ> [<ffffffff81056c13>] ? warn_slowpath_common+0x73/0xb0 > [ 1.240409] [<ffffffff81079608>] ? tick_dev_program_event+0x38/0xc0 > [ 1.240699] [<ffffffff81056cb0>] ? warn_slowpath_fmt+0x40/0x50 > [ 1.240992] [<ffffffff81079608>] ? tick_dev_program_event+0x38/0xc0 > [ 1.241281] [<ffffffff81041ad0>] ? hpet_next_event+0x70/0x80 > [ 1.241573] [<ffffffff81079608>] ? tick_dev_program_event+0x38/0xc0 > [ 1.241859] [<ffffffff81078e32>] ? tick_handle_oneshot_broadcast+0xe2/0x100 > [ 1.246533] [<ffffffff8102a67a>] ? timer_interrupt+0x1a/0x30 > [ 1.246826] [<ffffffff81085499>] ? handle_IRQ_event+0x39/0xd0 > [ 1.247118] [<ffffffff81087368>] ? handle_edge_irq+0xb8/0x160 > [ 1.247407] [<ffffffff81029f55>] ? handle_irq+0x15/0x20 > [ 1.247689] [<ffffffff810294a2>] ? do_IRQ+0x62/0xe0 > [ 1.247976] [<ffffffff8146be53>] ? ret_from_intr+0x0/0xa > [ 1.248262] <EOI> [<ffffffff8102f277>] ? mwait_idle+0x57/0x80 > [ 1.248796] [<ffffffff8102645c>] ? cpu_idle+0x5c/0xb0 > [ 1.249080] ---[ end trace db7f668fb6fef4e1 ]--- > > Is this something Intel has to fix or is it a bug in the kernel? This is a chipset erratum. Thomas: You mentioned we can retain this check only for known-buggy and hpet debug kind of options. But here is the simple workaround patch for this particular erratum. Some chipsets have a erratum due to which read immediately following a write of HPET comparator returns old comparator value instead of most recently written value. Erratum 15 in "Intel I/O Controller Hub 9 (ICH9) Family Specification Update" (http://www.intel.com/assets/pdf/specupdate/316973.pdf) Workaround for the errata is to read the comparator twice if the first one fails. Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com> LKML-Reference: <20100225185348.GA9674@linux-os.sc.intel.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@gmail.com> Cc: <stable@kernel.org>
-
由 Andi Kleen 提交于
We found a system where the MP table MPC and MPF structures overlap. That doesn't really matter because the mptable is not used anyways with ACPI, but it leads to a panic in the early allocator due to the overlapping reservations in 2.6.33. Earlier kernels handled this without problems. Simply change these reservations to reserve_early_overlap_ok to avoid the panic. Reported-by: NThomas Renninger <trenn@suse.de> Tested-by: NThomas Renninger <trenn@suse.de> Signed-off-by: NAndi Kleen <ak@linux.intel.com> LKML-Reference: <20100329074111.GA22821@basil.fritz.box> Signed-off-by: NH. Peter Anvin <hpa@zytor.com> Cc: <stable@kernel.org>
-
- 01 4月, 2010 3 次提交
-
-
由 Jason Wessel 提交于
It is required to call hw_breakpoint_init() on an attr before using it in any other calls. This fixes the problem where kgdb will sometimes fail to initialize on x86_64. Signed-off-by: NJason Wessel <jason.wessel@windriver.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: 2.6.33 <stable@kernel.org> LKML-Reference: <1269975907-27602-1-git-send-email-jason.wessel@windriver.com> Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
-
由 Frederic Weisbecker 提交于
Scheduler's task migration events don't work because they always pass NULL regs perf_sw_event(). The event hence gets filtered in perf_swevent_add(). Scheduler's context switches events use task_pt_regs() to get the context when the event occured which is a wrong thing to do as this won't give us the place in the kernel where we went to sleep but the place where we left userspace. The result is even more wrong if we switch from a kernel thread. Use the hot regs snapshot for both events as they belong to the non-interrupt/exception based events family. Unlike page faults or so that provide the regs matching the exact origin of the event, we need to save the current context. This makes the task migration event working and fix the context switch callchains and origin ip. Example: perf record -a -e cs Before: 10.91% ksoftirqd/0 0 [k] 0000000000000000 | --- (nil) perf_callchain perf_prepare_sample __perf_event_overflow perf_swevent_overflow perf_swevent_add perf_swevent_ctx_event do_perf_sw_event __perf_sw_event perf_event_task_sched_out schedule run_ksoftirqd kthread kernel_thread_helper After: 23.77% hald-addon-stor [kernel.kallsyms] [k] schedule | --- schedule | |--60.00%-- schedule_timeout | wait_for_common | wait_for_completion | blk_execute_rq | scsi_execute | scsi_execute_req | sr_test_unit_ready | | | |--66.67%-- sr_media_change | | media_changed | | cdrom_media_changed | | sr_block_media_changed | | check_disk_change | | cdrom_open v2: Always build perf_arch_fetch_caller_regs() now that software events need that too. They don't need it from modules, unlike trace events, so we keep the EXPORT_SYMBOL in trace_event_perf.c Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net>
-
由 Yinghai Lu 提交于
Rusty found on lguest with trim_bios_range, max_pfn is not right anymore, and looks e820_remove_range does not work right. [ 0.000000] BIOS-provided physical RAM map: [ 0.000000] LGUEST: 0000000000000000 - 0000000004000000 (usable) [ 0.000000] Notice: NX (Execute Disable) protection missing in CPU or disabled in BIOS! [ 0.000000] DMI not present or invalid. [ 0.000000] last_pfn = 0x3fa0 max_arch_pfn = 0x100000 [ 0.000000] init_memory_mapping: 0000000000000000-0000000003fa0000 root cause is: the e820_remove_range doesn't handle the all covered case. e820_remove_range(BIOS_START, BIOS_END - BIOS_START, ...) produces a bogus range as a result. Make it match e820_update_range() by handling that case too. Reported-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NYinghai Lu <yinghai@kernel.org> Tested-by: NRusty Russell <rusty@rustcorp.com.au> LKML-Reference: <4BB18E55.6090903@kernel.org> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 30 3月, 2010 3 次提交
-
-
由 Tejun Heo 提交于
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: NTejun Heo <tj@kernel.org> Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
-
由 Yinghai Lu 提交于
When CONFIG_NO_BOOTMEM=y, it could use memory more effiently, or in a more compact fashion. Example: Allocated new RAMDISK: 00ec2000 - 0248ce57 Move RAMDISK from 000000002ea04000 - 000000002ffcee56 to 00ec2000 - 0248ce56 The new RAMDISK's end is not page aligned. Last page could be shared with other users. When free_init_pages are called for initrd or .init, the page could be freed and we could corrupt other data. code segment in free_init_pages(): | for (; addr < end; addr += PAGE_SIZE) { | ClearPageReserved(virt_to_page(addr)); | init_page_count(virt_to_page(addr)); | memset((void *)(addr & ~(PAGE_SIZE-1)), | POISON_FREE_INITMEM, PAGE_SIZE); | free_page(addr); | totalram_pages++; | } last half page could be used as one whole free page. So page align the boundaries. -v2: make the original initramdisk to be aligned, according to Johannes, otherwise we have the chance to lose one page. we still need to keep initrd_end not aligned, otherwise it could confuse decompressor. -v3: change to WARN_ON instead, suggested by Johannes. -v4: use PAGE_ALIGN, suggested by Johannes. We may fix that macro name later to PAGE_ALIGN_UP, and PAGE_ALIGN_DOWN Add comments about assuming ramdisk start is aligned in relocate_initrd(), change to re get ramdisk_image instead of save it to make diff smaller. Add warning for wrong range, suggested by Johannes. -v6: remove one WARN() We need to align beginning in free_init_pages() do not copy more than ramdisk_size, noticed by Johannes Reported-by: NStanislaw Gruszka <sgruszka@redhat.com> Tested-by: NStanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: NYinghai Lu <yinghai@kernel.org> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: David Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <1269830604-26214-3-git-send-email-yinghai@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Yinghai Lu 提交于
Fix: ------------[ cut here ]------------ WARNING: at arch/x86/mm/init.c:342 free_init_pages+0x4c/0xfa() free_init_pages: range [0x40daf000, 0x40db5c24] is not aligned Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.34-rc2-tip-03946-g4f16b23-dirty #50 Call Trace: [<40232e9f>] warn_slowpath_common+0x65/0x7c [<4021c9f0>] ? free_init_pages+0x4c/0xfa [<40881434>] ? _etext+0x0/0x24 [<40232eea>] warn_slowpath_fmt+0x24/0x27 [<4021c9f0>] free_init_pages+0x4c/0xfa [<40881434>] ? _etext+0x0/0x24 [<40d3f4bd>] alternative_instructions+0xf6/0x100 [<40d3fe4f>] check_bugs+0xbd/0xbf [<40d398a7>] start_kernel+0x2d5/0x2e4 [<40d390ce>] i386_start_kernel+0xce/0xd5 ---[ end trace 4eaa2a86a8e2da22 ]--- Comments in vmlinux.lds.S already said: | /* | * smp_locks might be freed after init | * start/end must be page aligned | */ Signed-off-by: NYinghai Lu <yinghai@kernel.org> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: David Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <1269830604-26214-2-git-send-email-yinghai@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 23 3月, 2010 1 次提交
-
-
由 Rafael J. Wysocki 提交于
Commit 3f6da390 (perf: Rework and fix the arch CPU-hotplug hooks) broke suspend to RAM on my HP nx6325 (and most likely on other AMD-based boxes too) by allowing amd_pmu_cpu_offline() to be executed for CPUs that are going offline as part of the suspend process. The problem is that cpuhw->amd_nb may be NULL already, so the function should make sure it's not NULL before accessing the object pointed to by it. Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 3月, 2010 1 次提交
-
-
由 Andreas Herrmann 提交于
Currently c1e_idle returns true for all CPUs greater than or equal to family 0xf model 0x40. This covers too many CPUs. Meanwhile a respective erratum for the underlying problem was filed (#400). This patch adds the logic to check whether erratum #400 applies to a given CPU. Especially for CPUs where SMI/HW triggered C1e is not supported, c1e_idle() doesn't need to be used. We can check this by looking at the respective OSVW bit for erratum #400. Cc: <stable@kernel.org> # .32.x .33.x Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20100319110922.GA19614@alberich.amd.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 17 3月, 2010 1 次提交
-
-
由 Frederic Weisbecker 提交于
perf_arch_fetch_caller_regs() is exported for the overriden x86 version, but not for the generic weak version. As a general rule, weak functions should not have their symbol exported in the same file they are defined. So let's export it on trace_event_perf.c as it is used by trace events only. This fixes: ERROR: ".perf_arch_fetch_caller_regs" [fs/xfs/xfs.ko] undefined! ERROR: ".perf_arch_fetch_caller_regs" [arch/powerpc/platforms/cell/spufs/spufs.ko] undefined! -v2: And also only build it if trace events are enabled. -v3: Fix changelog mistake Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1268697902-9518-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-