- 14 3月, 2012 1 次提交
-
-
由 Tang Liang 提交于
The ACPI suspend path makes a call to tboot_sleep right before it writes the PM1A, PM1B values. We replace the direct call to tboot via an registration callback similar to __acpi_register_gsi. CC: Len Brown <len.brown@intel.com> Acked-by: NJoseph Cihula <joseph.cihula@intel.com> Acked-by: NRafael J. Wysocki <rjw@sisk.pl> [v1: Added __attribute__ ((unused))] [v2: Introduced a wrapper instead of changing tboot_sleep return values] [v3: Added return value AE_CTRL_SKIP for acpi_os_sleep_prepare] Signed-off-by: NTang Liang <liang.tang@oracle.com> [v1: Fix compile issues on IA64 and PPC64] [v2: Fix where __acpi_os_prepare_sleep==NULL and did not go in sleep properly] Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 26 12月, 2011 1 次提交
-
-
由 Jan Kiszka 提交于
Unlike all of the other cpuid bits, the TSC deadline timer bit is set unconditionally, regardless of what userspace wants. This is broken in several ways: - if userspace doesn't use KVM_CREATE_IRQCHIP, and doesn't emulate the TSC deadline timer feature, a guest that uses the feature will break - live migration to older host kernels that don't support the TSC deadline timer will cause the feature to be pulled from under the guest's feet; breaking it - guests that are broken wrt the feature will fail. Fix by not enabling the feature automatically; instead report it to userspace. Because the feature depends on KVM_CREATE_IRQCHIP, which we cannot guarantee will be called, we expose it via a KVM_CAP_TSC_DEADLINE_TIMER and not KVM_GET_SUPPORTED_CPUID. Fixes the Illumos guest kernel, which uses the TSC deadline timer feature. [avi: add the KVM_CAP + documentation] Reported-by: NAlexey Zaytsev <alexey.zaytsev@gmail.com> Tested-by: NAlexey Zaytsev <alexey.zaytsev@gmail.com> Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 25 12月, 2011 1 次提交
-
-
由 Jan Kiszka 提交于
User space may create the PIT and forgets about setting up the irqchips. In that case, firing PIT IRQs will crash the host: BUG: unable to handle kernel NULL pointer dereference at 0000000000000128 IP: [<ffffffffa10f6280>] kvm_set_irq+0x30/0x170 [kvm] ... Call Trace: [<ffffffffa11228c1>] pit_do_work+0x51/0xd0 [kvm] [<ffffffff81071431>] process_one_work+0x111/0x4d0 [<ffffffff81071bb2>] worker_thread+0x152/0x340 [<ffffffff81075c8e>] kthread+0x7e/0x90 [<ffffffff815a4474>] kernel_thread_helper+0x4/0x10 Prevent this by checking the irqchip mode before starting a timer. We can't deny creating the PIT if the irqchips aren't set up yet as current user land expects this order to work. Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
- 24 12月, 2011 1 次提交
-
-
由 Robert Richter 提交于
Use raw_spin_unlock_irqrestore() as equivalent to raw_spin_lock_irqsave(). Signed-off-by: NRobert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1324646665-13334-1-git-send-email-robert.richter@amd.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 20 12月, 2011 2 次提交
-
-
由 Clemens Ladisch 提交于
When printing the code bytes in show_registers(), the markers around the byte at the fault address could make the printk() format string look like a valid log level and facility code. This would prevent this byte from being printed and result in a spurious newline: [ 7555.765589] Code: 8b 32 e9 94 00 00 00 81 7d 00 ff 00 00 00 0f 87 96 00 00 00 48 8b 83 c0 00 00 00 44 89 e2 44 89 e6 48 89 df 48 8b 80 d8 02 00 00 [ 7555.765683] 8b 48 28 48 89 d0 81 e2 ff 0f 00 00 48 c1 e8 0c 48 c1 e0 04 Add KERN_CONT where needed, and elsewhere in show_registers() for consistency. Signed-off-by: NClemens Ladisch <clemens@ladisch.de> Link: http://lkml.kernel.org/r/4EEFA7AE.9020407@ladisch.deSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Markus Kötter 提交于
x86 jump instruction size is 2 or 5 bytes (near/long jump), not 2 or 6 bytes. In case a conditional jump is followed by a long jump, conditional jump target is one byte past the start of target instruction. Signed-off-by: NMarkus Kötter <nepenthesdev@gmail.com> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 12月, 2011 1 次提交
-
-
由 Ian Campbell 提交于
d312ae87 "xen: use maximum reservation to limit amount of usable RAM" clamped the total amount of RAM to the current maximum reservation. This is correct for dom0 but is not correct for guest domains. In order to boot a guest "pre-ballooned" (e.g. with memory=1G but maxmem=2G) in order to allow for future memory expansion the guest must derive max_pfn from the e820 provided by the toolstack and not the current maximum reservation (which can reflect only the current maximum, not the guest lifetime max). The existing algorithm already behaves this correctly if we do not artificially limit the maximum number of pages for the guest case. For a guest booted with maxmem=512, memory=128 this results in: [ 0.000000] BIOS-provided physical RAM map: [ 0.000000] Xen: 0000000000000000 - 00000000000a0000 (usable) [ 0.000000] Xen: 00000000000a0000 - 0000000000100000 (reserved) -[ 0.000000] Xen: 0000000000100000 - 0000000008100000 (usable) -[ 0.000000] Xen: 0000000008100000 - 0000000020800000 (unusable) +[ 0.000000] Xen: 0000000000100000 - 0000000020800000 (usable) ... [ 0.000000] NX (Execute Disable) protection: active [ 0.000000] DMI not present or invalid. [ 0.000000] e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved) [ 0.000000] e820 remove range: 00000000000a0000 - 0000000000100000 (usable) -[ 0.000000] last_pfn = 0x8100 max_arch_pfn = 0x1000000 +[ 0.000000] last_pfn = 0x20800 max_arch_pfn = 0x1000000 [ 0.000000] initial memory mapped : 0 - 027ff000 [ 0.000000] Base memory trampoline at [c009f000] 9f000 size 4096 -[ 0.000000] init_memory_mapping: 0000000000000000-0000000008100000 -[ 0.000000] 0000000000 - 0008100000 page 4k -[ 0.000000] kernel direct mapping tables up to 8100000 @ 27bb000-27ff000 +[ 0.000000] init_memory_mapping: 0000000000000000-0000000020800000 +[ 0.000000] 0000000000 - 0020800000 page 4k +[ 0.000000] kernel direct mapping tables up to 20800000 @ 26f8000-27ff000 [ 0.000000] xen: setting RW the range 27e8000 - 27ff000 [ 0.000000] 0MB HIGHMEM available. -[ 0.000000] 129MB LOWMEM available. -[ 0.000000] mapped low ram: 0 - 08100000 -[ 0.000000] low ram: 0 - 08100000 +[ 0.000000] 520MB LOWMEM available. +[ 0.000000] mapped low ram: 0 - 20800000 +[ 0.000000] low ram: 0 - 20800000 With this change "xl mem-set <domain> 512M" will successfully increase the guest RAM (by reducing the balloon). There is no change for dom0. Reported-and-Tested-by: NGeorge Shuklin <george.shuklin@gmail.com> Signed-off-by: NIan Campbell <ian.campbell@citrix.com> Cc: stable@kernel.org Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 13 12月, 2011 1 次提交
-
-
由 Keith Packard 提交于
This hangs my MacBook Air at boot time; I get no console messages at all. I reverted this on top of -rc5 and my machine boots again. This reverts commit e8c71062. Signed-off-by: NMatt Fleming <matt.fleming@intel.com> Signed-off-by: NKeith Packard <keithp@keithp.com> Acked-by: NH. Peter Anvin <hpa@zytor.com> Cc: Matthew Garrett <mjg@redhat.com> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Huang Ying <huang.ying.caritas@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1321621751-3650-1-git-send-email-matt@consoleSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 10 12月, 2011 1 次提交
-
-
由 Matt Fleming 提交于
efi_call_phys_prelog() sets up a 1:1 mapping of the physical address range in swapper_pg_dir. Instead of replacing then restoring entries in swapper_pg_dir we should be using initial_page_table which already contains the 1:1 mapping. It's safe to blindly switch back to swapper_pg_dir in the epilog because the physical EFI routines are only called before efi_enter_virtual_mode(), e.g. before any user processes have been forked. Therefore, we don't need to track which pgd was in %cr3 when we entered the prelog. The previous code actually contained a bug because it assumed that the kernel was loaded at a physical address within the first 8MB of ram, usually at 0x100000. However, this isn't the case with a CONFIG_RELOCATABLE=y kernel which could have been loaded anywhere in the physical address space. Also delete the ancient (and bogus) comments about the page table being restored after the lock is released. There is no locking. Cc: Matthew Garrett <mjg@redhat.com> Cc: Darrent Hart <dvhart@linux.intel.com> Signed-off-by: NMatt Fleming <matt.fleming@intel.com> Link: http://lkml.kernel.org/r/1323346250.3894.74.camel@mfleming-mobl1.ger.corp.intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 09 12月, 2011 3 次提交
-
-
由 Youquan Song 提交于
With the 3.2-rc kernel, IOMMU 2M pages in KVM works. But when I tried to use IOMMU 1GB pages in KVM, I encountered an oops and the 1GB page failed to be used. The root cause is that 1GB page allocation calls gup_huge_pud() while 2M page calls gup_huge_pmd. If compound pages are used and the page is a tail page, gup_huge_pmd() increases _mapcount to record tail page are mapped while gup_huge_pud does not do that. So when the mapped page is relesed, it will result in kernel oops because the page is not marked mapped. This patch add tail process for compound page in 1GB huge page which keeps the same process as 2M page. Reproduce like: 1. Add grub boot option: hugepagesz=1G hugepages=8 2. mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages 3. qemu-kvm -m 2048 -hda os-kvm.img -cpu kvm64 -smp 4 -mem-path /dev/hugepages -net none -device pci-assign,host=07:00.1 kernel BUG at mm/swap.c:114! invalid opcode: 0000 [#1] SMP Call Trace: put_page+0x15/0x37 kvm_release_pfn_clean+0x31/0x36 kvm_iommu_put_pages+0x94/0xb1 kvm_iommu_unmap_memslots+0x80/0xb6 kvm_assign_device+0xba/0x117 kvm_vm_ioctl_assigned_device+0x301/0xa47 kvm_vm_ioctl+0x36c/0x3a2 do_vfs_ioctl+0x49e/0x4e4 sys_ioctl+0x5a/0x7c system_call_fastpath+0x16/0x1b RIP put_compound_page+0xd4/0x168 Signed-off-by: NYouquan Song <youquan.song@intel.com> Reviewed-by: NAndrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matt Fleming 提交于
If we encounter an efi_memory_desc_t without EFI_MEMORY_WB set in ->attribute we currently call set_memory_uc(), which in turn calls __pa() on a potentially ioremap'd address. On CONFIG_X86_32 this is invalid, resulting in the following oops on some machines: BUG: unable to handle kernel paging request at f7f22280 IP: [<c10257b9>] reserve_ram_pages_type+0x89/0x210 [...] Call Trace: [<c104f8ca>] ? page_is_ram+0x1a/0x40 [<c1025aff>] reserve_memtype+0xdf/0x2f0 [<c1024dc9>] set_memory_uc+0x49/0xa0 [<c19334d0>] efi_enter_virtual_mode+0x1c2/0x3aa [<c19216d4>] start_kernel+0x291/0x2f2 [<c19211c7>] ? loglevel+0x1b/0x1b [<c19210bf>] i386_start_kernel+0xbf/0xc8 A better approach to this problem is to map the memory region with the correct attributes from the start, instead of modifying it after the fact. The uncached case can be handled by ioremap_nocache() and the cached by ioremap_cache(). Despite first impressions, it's not possible to use ioremap_cache() to map all cached memory regions on CONFIG_X86_64 because EFI_RUNTIME_SERVICES_DATA regions really don't like being mapped into the vmalloc space, as detailed in the following bug report, https://bugzilla.redhat.com/show_bug.cgi?id=748516 Therefore, we need to ensure that any EFI_RUNTIME_SERVICES_DATA regions are covered by the direct kernel mapping table on CONFIG_X86_64. To accomplish this we now map E820_RESERVED_EFI regions via the direct kernel mapping with the initial call to init_memory_mapping() in setup_arch(), whereas previously these regions wouldn't be mapped if they were after the last E820_RAM region until efi_ioremap() was called. Doing it this way allows us to delete efi_ioremap() completely. Signed-off-by: NMatt Fleming <matt.fleming@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Matthew Garrett <mjg@redhat.com> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Huang Ying <huang.ying.caritas@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1321621751-3650-1-git-send-email-matt@console-pimps.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mark Langsdorf 提交于
When HPET is operating in RTC mode, the TN_ENABLE bit on timer1 controls whether the HPET or the RTC delivers interrupts to irq8. When the system goes into suspend, the RTC driver sends a signal to the HPET driver so that the HPET releases control of irq8, allowing the RTC to wake the system from suspend. The switchover is accomplished by a write to the HPET configuration registers which currently only occurs while servicing the HPET interrupt. On some systems, I have seen the system suspend before an HPET interrupt occurs, preventing the write to the HPET configuration register and leaving the HPET in control of the irq8. As the HPET is not active during suspend, it does not generate a wake signal and RTC alarms do not work. This patch forces the HPET driver to immediately transfer control of the irq8 channel to the RTC instead of waiting until the next interrupt event. Signed-off-by: NMark Langsdorf <mark.langsdorf@amd.com> Link: http://lkml.kernel.org/r/20111118153306.GB16319@alberich.amd.comTested-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org
-
- 06 12月, 2011 5 次提交
-
-
由 Alan Cox 提交于
If we select a symbol it should have a type declared first otherwise in some situations the config tools get upset. They are currently perhaps a bit too resilient which is why this wasn't noticed initially. Signed-off-by: NAlan Cox <alan@linux.intel.com> Link: http://lkml.kernel.org/r/20111206132811.4041.32549.stgit@bob.linux.org.ukSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Alan Cox 提交于
We currently fail to build on CONFIG_X86_INTEL_MID=y and CONFIG_X86_MRST unset. We could build all the bits to make generic MID work if you picked MID platform alone but that's really silly. Instead use select and two variables. This looks a bit daft right now but once we add a Medfield selection it'll start to look a good deal more sensible. Reported-by: NIngo Molnar <mingo@elte.hu> Reported-by: NStanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: NAlan Cox <alan@linux.intel.com> Link: http://lkml.kernel.org/r/20111205231433.28811.51297.stgit@bob.linux.org.ukSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Andreas Herrmann 提交于
I've received complaints that the numa_node attribute for family 15h model 00-0fh (e.g. Interlagos) northbridge functions shows -1 instead of the proper node ID. Correct this with attached quirks (similar to quirks for other AMD CPU families used in multi-socket systems). Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Cc: Frank Arnold <frank.arnold@amd.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/20111202072143.GA31916@alberich.amd.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mathias Nyman 提交于
Intel MID x86 platforms have a memory mapped virtual RTC instead. No MID platform have the default ports (and accessing them may do weird stuff). Signed-off-by: NMathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: NAlan Cox <alan@linux.intel.com> Cc: feng.tang@intel.com Cc: Feng Tang <feng.tang@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Konrad Rzeszutek Wilk 提交于
Fix an outstanding issue that has been reported since 2.6.37. Under a heavy loaded machine processing "fork()" calls could crash with: BUG: unable to handle kernel paging request at f573fc8c IP: [<c01abc54>] swap_count_continued+0x104/0x180 *pdpt = 000000002a3b9027 *pde = 0000000001bed067 *pte = 0000000000000000 Oops: 0000 [#1] SMP Modules linked in: Pid: 1638, comm: apache2 Not tainted 3.0.4-linode37 #1 EIP: 0061:[<c01abc54>] EFLAGS: 00210246 CPU: 3 EIP is at swap_count_continued+0x104/0x180 .. snip.. Call Trace: [<c01ac222>] ? __swap_duplicate+0xc2/0x160 [<c01040f7>] ? pte_mfn_to_pfn+0x87/0xe0 [<c01ac2e4>] ? swap_duplicate+0x14/0x40 [<c01a0a6b>] ? copy_pte_range+0x45b/0x500 [<c01a0ca5>] ? copy_page_range+0x195/0x200 [<c01328c6>] ? dup_mmap+0x1c6/0x2c0 [<c0132cf8>] ? dup_mm+0xa8/0x130 [<c013376a>] ? copy_process+0x98a/0xb30 [<c013395f>] ? do_fork+0x4f/0x280 [<c01573b3>] ? getnstimeofday+0x43/0x100 [<c010f770>] ? sys_clone+0x30/0x40 [<c06c048d>] ? ptregs_clone+0x15/0x48 [<c06bfb71>] ? syscall_call+0x7/0xb The problem is that in copy_page_range() we turn lazy mode on, and then in swap_entry_free() we call swap_count_continued() which ends up in: map = kmap_atomic(page, KM_USER0) + offset; and then later we touch *map. Since we are running in batched mode (lazy) we don't actually set up the PTE mappings and the kmap_atomic is not done synchronously and ends up trying to dereference a page that has not been set. Looking at kmap_atomic_prot_pfn(), it uses 'arch_flush_lazy_mmu_mode' and doing the same in kmap_atomic_prot() and __kunmap_atomic() makes the problem go away. Interestingly, commit b8bcfe99 ("x86/paravirt: remove lazy mode in interrupts") removed part of this to fix an interrupt issue - but it went to far and did not consider this scenario. Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 05 12月, 2011 15 次提交
-
-
由 Peter Chubb 提交于
Looks like on some Acer Aspire 1s with older bioses, reboot via bios fails. It works on my machine, (with BIOS version 0.3310) but not on some others (BIOS version 0.3309). There's a log of problems at: https://bbs.archlinux.org/viewtopic.php?id=124136 This patch adds a different callback to the reboot quirk table, to allow rebooting via keybaord controller. Reported-by: NUroš Vampl <mobile.leecher@gmail.com> Tested-by: NVasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: NPeter Chubb <peter.chubb@nicta.com.au> Cc: Don Zickus <dzickus@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: stable@kernel.org Link: http://lkml.kernel.org/r/1323093233-9481-1-git-send-email-anarsoul@gmail.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ajaykumar Hotchandani 提交于
Following is from Notes of section 11.5.3 of Intel processor manual available at: http://www.intel.com/Assets/PDF/manual/325384.pdf For the Pentium 4 and Intel Xeon processors, after the sequence of steps given above has been executed, the cache lines containing the code between the end of the WBINVD instruction and before the MTRRS have actually been disabled may be retained in the cache hierarchy. Here, to remove code from the cache completely, a second WBINVD instruction must be executed after the MTRRs have been disabled. This patch provides resolution for that. Ideally, I will like to make changes only for Pentium 4 and Xeon processors. But, I am not finding easier way to do it. And, extra wbinvd() instruction does not hurt much for other processors. Signed-off-by: NAjaykumar Hotchandani <ajaykumar.hotchandani@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@profusion.mobi> Link: http://lkml.kernel.org/r/4EBD1CC5.3030008@oracle.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Borislav Petkov 提交于
Recently, I got bitten by using rdmsr_safe too early in the boot process. Document its shortcomings for future reference. Link: http://lkml.kernel.org/r/4ED5B70F.606@lwfinger.netSigned-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Srivatsa S. Bhat 提交于
The microcode update driver's initialization code does not handle failures correctly. This patch fixes this issue. Signed-off-by: NJan Beulich <JBeulich@suse.com> Signed-off-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20111107123530.12164.31227.stgit@srivatsabhat.in.ibm.com Link: http://lkml.kernel.org/r/4ED8E2270200007800065120@nat28.tlf.novell.comSigned-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Prarit Bhargava 提交于
TAINT_FIRMWARE_WORKAROUND should be set when an MTRR fixup is done. Signed-off-by: NPrarit Bhargava <prarit@redhat.com> Acked-by: NDavid Rientjes <rientjes@google.com> Link: http://lkml.kernel.org/r/1318958650-12447-1-git-send-email-prarit@redhat.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Bjorn Helgaas 提交于
In commit f8924e77 ("x86: unify mp_bus_info"), the 32-bit and 64-bit versions of MP_bus_info were rearranged to match each other better. Unfortunately it introduced a regression: prior to that change we used to always set the mp_bus_not_pci bit, then clear it if we found a PCI bus. After it, we set mp_bus_not_pci for ISA buses, clear it for PCI buses, and leave it alone otherwise. In the cases of ISA and PCI, there's not much difference. But ISA is not the only non-PCI bus, so it's better to always set mp_bus_not_pci and clear it only for PCI. Without this change, Dan's Dell PowerEdge 4200 panics on boot with a log indicating interrupt routing trouble unless the "noapic" option is supplied. With this change, the machine boots reliably without "noapic". Fixes http://bugs.debian.org/586494Reported-bisected-and-tested-by: NDan McGrath <troubledaemon@gmail.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # 2.6.26+ Cc: Dan McGrath <troubledaemon@gmail.com> Cc: Alexey Starikovskiy <aystarik@gmail.com> [jrnieder@gmail.com: clarified commit message] Signed-off-by: NJonathan Nieder <jrnieder@gmail.com> Link: http://lkml.kernel.org/r/20111122215000.GA9151@elie.hsd1.il.comcast.netSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Feng Tang 提交于
In latest firmware's SFI tables, pmic_gpio has been set to IPC type of device, so we need handle it too. Signed-off-by: NFeng Tang <feng.tang@intel.com> Signed-off-by: NAlan Cox <alan@linux.intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Jekyll Lai 提交于
Add SFI glue for the following devices: tca6416: a gpio expander compatible with max7315 mpu3050: gyro sensor Both of these actual drivers are already upstream Signed-off-by: NJekyll Lai <jekyll_lai@wistron.com> Signed-off-by: NAlan Cox <alan@linux.intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Jacob Pan 提交于
On the Intel MID devices SCU commands are issued to manage power off and the like. We need to issue different ones for non-Lincroft based devices. Signed-off-by: NAlek Du <alek.du@intel.com> Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: NAlan Cox <alan@linux.intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Rafael J. Wysocki 提交于
Dell OptiPlex 990 is known to require PCI reboot, so add it to the reboot blacklist in pci_reboot_dmi_table[]. Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl> Link: http://lkml.kernel.org/r/201111160019.51303.rjw@sisk.plSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Jack Steiner 提交于
There was a mixup when the SGI UV2 hub chip was sent to be fabricated, and it ended up with the wrong part number in the HRP_NODE_ID mmr. Future versions of the chip will (may) have the correct part number. Change the UV infrastructure to recognize both part numbers as valid IDs of a UV2 hub chip. Signed-off-by: NJack Steiner <steiner@sgi.com> Link: http://lkml.kernel.org/r/20111129210058.GA20452@sgi.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mitsuo Hayasaka 提交于
The kernel stack overflow is checked in stack_overflow_check(), which may wrongly detect the overflow if the stack pointer in user space points to the kernel stack intentionally or accidentally. So, the actual overflow is never detected after this misdetection because WARN_ONCE() is used on the detection of it. This patch adds user-mode-vm checking before it to avoid this problem and bails out early if the user stack is used. Signed-off-by: NMitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com> Cc: yrl.pp-manager.tt@hitachi.com Cc: Randy Dunlap <rdunlap@xenotime.net> Link: http://lkml.kernel.org/r/20111129060821.11076.55315.stgit@ltc219.sdl.hitachi.co.jpSigned-off-by: NIngo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com>
-
由 Robert Richter 提交于
On AMD family 10h we see firmware bug messages like the following: [Firmware Bug]: cpu 6, try to use APIC500 (LVT offset 0) for vector 0x10400, but the register is already in use for vector 0xf9 on another cpu [Firmware Bug]: cpu 6, IBS interrupt offset 0 not available (MSRC001103A=0x0000000000000100) [Firmware Bug]: using offset 1 for IBS interrupts [Firmware Bug]: workaround enabled for IBS LVT offset perf: AMD IBS detected (0x00000007) We always see this, since the offsets are not assigned by the BIOS for this family. Force LVT offset assignment in this case. If the OS assignment fails, fallback to BIOS settings and try to setup this. The fallback to BIOS settings weakens the family check since force_ibs_eilvt_setup() may fail e.g. in case of virtual machines. But setup may still succeed if BIOS offsets are correct. Other families don't have a workaround implemented that assigns LVT offsets. It's ok, to drop calling force_ibs_eilvt_setup() for that families. With the patch the [Firmware Bug] messages vanish. We see now: IBS: LVT offset 1 assigned perf: AMD IBS detected (0x00000007) Signed-off-by: NRobert Richter <robert.richter@amd.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20111109162225.GO12451@erda.amd.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Cc: Stephane Eranian <eranian@google.com> Cc: stable@kernel.org Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Linus Torvalds 提交于
People with old AMD chips are getting hung boots, because commit bcb80e53 ("x86, microcode, AMD: Add microcode revision to /proc/cpuinfo") moved the microcode detection too early into "early_init_amd()". At that point we are *so* early in the booth that the exception tables haven't even been set up yet, so the whole rdmsr_safe(MSR_AMD64_PATCH_LEVEL, &c->microcode, &dummy); doesn't actually work: if the rdmsr does a GP fault (due to non-existant MSR register on older CPU's), we can't fix it up yet, and the boot fails. Fix it by simply moving the code to a slightly later point in the boot (init_amd() instead of early_init_amd()), since the kernel itself doesn't even really care about the microcode patchlevel at this point (or really ever: it's made available to user space in /proc/cpuinfo, and updated if you do a microcode load). Reported-tested-and-bisected-by: NLarry Finger <Larry.Finger@lwfinger.net> Tested-by: NBob Tracy <rct@gherkin.frus.com> Acked-by: NBorislav Petkov <borislav.petkov@amd.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 12月, 2011 1 次提交
-
-
由 Konrad Rzeszutek Wilk 提交于
The idea behind commit d91ee586 ("cpuidle: replace xen access to x86 pm_idle and default_idle") was to have one call - disable_cpuidle() which would make pm_idle not be molested by other code. It disallows cpuidle_idle_call to be set to pm_idle (which is excellent). But in the select_idle_routine() and idle_setup(), the pm_idle can still be set to either: amd_e400_idle, mwait_idle or default_idle. This depends on some CPU flags (MWAIT) and in AMD case on the type of CPU. In case of mwait_idle we can hit some instances where the hypervisor (Amazon EC2 specifically) sets the MWAIT and we get: Brought up 2 CPUs invalid opcode: 0000 [#1] SMP Pid: 0, comm: swapper Not tainted 3.1.0-0.rc6.git0.3.fc16.x86_64 #1 RIP: e030:[<ffffffff81015d1d>] [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4 ... Call Trace: [<ffffffff8100e2ed>] cpu_idle+0xae/0xe8 [<ffffffff8149ee78>] cpu_bringup_and_idle+0xe/0x10 RIP [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4 RSP <ffff8801d28ddf10> In the case of amd_e400_idle we don't get so spectacular crashes, but we do end up making an MSR which is trapped in the hypervisor, and then follow it up with a yield hypercall. Meaning we end up going to hypervisor twice instead of just once. The previous behavior before v3.0 was that pm_idle was set to default_idle regardless of select_idle_routine/idle_setup. We want to do that, but only for one specific case: Xen. This patch does that. Fixes RH BZ #739499 and Ubuntu #881076 Reported-by: NStefan Bader <stefan.bader@canonical.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 11月, 2011 1 次提交
-
-
由 Al Viro 提交于
wrong register returned... Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 11月, 2011 1 次提交
-
-
由 Avi Kivity 提交于
Prevent tracing of preempt_disable() in get_cpu_var() in kvm_clock_read(). When CONFIG_DEBUG_PREEMPT is enabled, preempt_disable/enable() are traced and this causes the function_graph tracer to go into an infinite recursion. By open coding the preempt_disable() around the get_cpu_var(), we can use the notrace version which prevents preempt_disable/enable() from being traced and prevents the recursion. Based on a similar patch for Xen from Jeremy Fitzhardinge. Tested-by: NGleb Natapov <gleb@redhat.com> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 17 11月, 2011 5 次提交
-
-
由 Gleb Natapov 提交于
Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Gleb Natapov 提交于
Support guest/host-only profiling by switch perf msrs on a guest entry if needed. Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Gleb Natapov 提交于
Some cpus have special support for switching PERF_GLOBAL_CTRL msr. Add logic to detect if such support exists and works properly and extend msr switching code to use it if available. Also extend number of generic msr switching entries to 8. Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Salman Qazi 提交于
(Added the missing signed-off-by line) In hundreds of days, the __cycles_2_ns calculation in sched_clock has an overflow. cyc * per_cpu(cyc2ns, cpu) exceeds 64 bits, causing the final value to become zero. We can solve this without losing any precision. We can decompose TSC into quotient and remainder of division by the scale factor, and then use this to convert TSC into nanoseconds. Signed-off-by: NSalman Qazi <sqazi@google.com> Acked-by: NJohn Stultz <johnstul@us.ibm.com> Reviewed-by: NPaul Turner <pjt@google.com> Cc: stable@kernel.org Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20111115221121.7262.88871.stgit@dungbeetle.mtv.corp.google.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Zhenzhong Duan 提交于
PVHVM running with more than 32 vcpus and pv_irq/pv_time enabled need VCPU placement to work, or else it will softlockup. CC: stable@kernel.org Acked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: NZhenzhong Duan <zhenzhong.duan@oracle.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-