- 15 11月, 2012 10 次提交
-
-
由 Fenghua Yu 提交于
The first cpu in irq cfg->domain is likely to be CPU 0 and may not be available when CPU 0 is offline. Instead of using CPU 0 to handle retriggered irq, we use first available CPU which is online and in this irq's domain. Signed-off-by: NFenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1352835171-3958-13-git-send-email-fenghua.yu@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Fenghua Yu 提交于
Ask the first online CPU to save mtrr instead of asking BSP. BSP could be offline when mtrr_save_state() is called. Signed-off-by: NFenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1352835171-3958-12-git-send-email-fenghua.yu@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Fenghua Yu 提交于
Previously these functions were not run on the BSP (CPU 0, the boot processor) since the boot processor init would only be executed before this functionality was initialized. Signed-off-by: NFenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1352835171-3958-11-git-send-email-fenghua.yu@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Fenghua Yu 提交于
Instead of waiting for STARTUP after INITs, BSP will execute the BIOS boot-strap code which is not a desired behavior for waking up BSP. To avoid the boot-strap code, wake up CPU0 by NMI instead. This works to wake up soft offlined CPU0 only. If CPU0 is hard offlined (i.e. physically hot removed and then hot added), NMI won't wake it up. We'll change this code in the future to wake up hard offlined CPU0 if real platform and request are available. AP is still waken up as before by INIT, SIPI, SIPI sequence. Signed-off-by: NFenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1352896613-25957-1-git-send-email-fenghua.yu@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Fenghua Yu 提交于
start_cpu0() is defined in head_32.S for 32-bit. The function sets up stack and jumps to start_secondary() for CPU0 wake up. Signed-off-by: NFenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1352835171-3958-9-git-send-email-fenghua.yu@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Fenghua Yu 提交于
start_cpu0() is defined in head_64.S for 64-bit. The function sets up stack and jumps to start_secondary() for CPU0 wake up. Signed-off-by: NFenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1352835171-3958-8-git-send-email-fenghua.yu@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Fenghua Yu 提交于
Because x86 BIOS requires CPU0 to resume from sleep, suspend or hibernate can't be executed if CPU0 is detected offline. To make suspend or hibernate and further resume succeed, CPU0 must be online. Signed-off-by: NFenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1352835171-3958-6-git-send-email-fenghua.yu@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Fenghua Yu 提交于
Add smp_store_boot_cpu_info() to store cpu info for BSP during boot time. Now smp_store_cpu_info() stores cpu info for bringing up BSP or AP after it's offline. Continue to online CPU0 in native_cpu_up(). Continue to offline CPU0 in native_cpu_disable(). Signed-off-by: NFenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1352835171-3958-5-git-send-email-fenghua.yu@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Fenghua Yu 提交于
If CONFIG_BOOTPARAM_HOTPLUG_CPU is turned on, CPU0 hotplug feature is enabled by default. If CONFIG_BOOTPARAM_HOTPLUG_CPU is not turned on, CPU0 hotplug feature is not enabled by default. The kernel parameter cpu0_hotplug can enable CPU0 hotplug feature at boot. Currently the feature is supported on Intel platforms only. Signed-off-by: NFenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1352835171-3958-4-git-send-email-fenghua.yu@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Fenghua Yu 提交于
New config switch CONFIG_BOOTPARAM_HOTPLUG_CPU0 sets default state of whether the CPU0 hotplug is on or off. If the switch is off, CPU0 is not hotpluggable by default. But the CPU0 hotplug feature can still be turned on by kernel parameter cpu0_hotplug at boot. If the switch is on, CPU0 is always hotpluggable. The default value of the switch is off. Signed-off-by: NFenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1352835171-3958-3-git-send-email-fenghua.yu@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 04 11月, 2012 1 次提交
-
-
由 Jan Beulich 提交于
While copying the argument structures in HYPERVISOR_event_channel_op() and HYPERVISOR_physdev_op() into the local variable is sufficiently safe even if the actual structure is smaller than the container one, copying back eventual output values the same way isn't: This may collide with on-stack variables (particularly "rc") which may change between the first and second memcpy() (i.e. the second memcpy() could discard that change). Move the fallback code into out-of-line functions, and handle all of the operations known by this old a hypervisor individually: Some don't require copying back anything at all, and for the rest use the individual argument structures' sizes rather than the container's. Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NJan Beulich <jbeulich@suse.com> [v2: Reduce #define/#undef usage in HYPERVISOR_physdev_op_compat().] [v3: Fix compile errors when modules use said hypercalls] [v4: Add xen_ prefix to the HYPERCALL_..] [v5: Alter the name and only EXPORT_SYMBOL_GPL one of them] Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 01 11月, 2012 2 次提交
-
-
由 Xiao Guangrong 提交于
After commit b3356bf0 (KVM: emulator: optimize "rep ins" handling), the pieces of io data can be collected and write them to the guest memory or MMIO together Unfortunately, kvm splits the mmio access into 8 bytes and store them to vcpu->mmio_fragments. If the guest uses "rep ins" to move large data, it will cause vcpu->mmio_fragments overflow The bug can be exposed by isapc (-M isapc): [23154.818733] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC [ ......] [23154.858083] Call Trace: [23154.859874] [<ffffffffa04f0e17>] kvm_get_cr8+0x1d/0x28 [kvm] [23154.861677] [<ffffffffa04fa6d4>] kvm_arch_vcpu_ioctl_run+0xcda/0xe45 [kvm] [23154.863604] [<ffffffffa04f5a1a>] ? kvm_arch_vcpu_load+0x17b/0x180 [kvm] Actually, we can use one mmio_fragment to store a large mmio access then split it when we pass the mmio-exit-info to userspace. After that, we only need two entries to store mmio info for the cross-mmio pages access Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Konrad Rzeszutek Wilk 提交于
As Mukesh explained it, the MMUEXT_TLB_FLUSH_ALL allows the hypervisor to do a TLB flush on all active vCPUs. If instead we were using the generic one (which ends up being xen_flush_tlb) we end up making the MMUEXT_TLB_FLUSH_LOCAL hypercall. But before we make that hypercall the kernel will IPI all of the vCPUs (even those that were asleep from the hypervisor perspective). The end result is that we needlessly wake them up and do a TLB flush when we can just let the hypervisor do it correctly. This patch gives around 50% speed improvement when migrating idle guest's from one host to another. Oracle-bug: 14630170 CC: stable@vger.kernel.org Tested-by: NJingjie Jiang <jingjie.jiang@oracle.com> Suggested-by: NMukesh Rathor <mukesh.rathor@oracle.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 30 10月, 2012 1 次提交
-
-
由 Olaf Hering 提交于
Signed-off-by: NOlaf Hering <olaf@aepfle.de> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 26 10月, 2012 2 次提交
-
-
由 Yinghai Lu 提交于
Commit 844ab6f9 x86, mm: Find_early_table_space based on ranges that are actually being mapped added back some lines back wrongly that has been removed in commit 7b16bbf9 Revert "x86/mm: Fix the size calculation of mapping tables" remove them again. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/CAE9FiQW_vuaYQbmagVnxT2DGsYc=9tNeAbdBq53sYkitPOwxSQ@mail.gmail.comAcked-by: NJacob Shin <jacob.shin@amd.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Olof Johansson 提交于
When 32-bit EFI is used with 64-bit kernel (or vice versa), turn off efi_enabled once setup is done. Beyond setup, it is normally used to determine if runtime services are available and we will have none. This will resolve issues stemming from efivars modprobe panicking on a 32/64-bit setup, as well as some reboot issues on similar setups. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=45991Reported-by: NMarko Kohtala <marko.kohtala@gmail.com> Reported-by: NMaxim Kammerer <mk@dee.su> Signed-off-by: NOlof Johansson <olof@lixom.net> Acked-by: NMaarten Lankhorst <maarten.lankhorst@canonical.com> Cc: stable@kernel.org # 3.4 - 3.6 Cc: Matthew Garrett <mjg@redhat.com> Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
-
- 25 10月, 2012 3 次提交
-
-
由 Jacob Shin 提交于
Current logic finds enough space for direct mapping page tables from 0 to end. Instead, we only need to find enough space to cover mr[0].start to mr[nr_range].end -- the range that is actually being mapped by init_memory_mapping() This is needed after 1bbbbe77, to address the panic reported here: https://lkml.org/lkml/2012/10/20/160 https://lkml.org/lkml/2012/10/21/157Signed-off-by: NJacob Shin <jacob.shin@amd.com> Link: http://lkml.kernel.org/r/20121024195311.GB11779@jshin-ToonieTested-by: NTom Rini <trini@ti.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Yinghai Lu 提交于
We need to handle E820_RAM and E820_RESERVED_KERNEL at the same time. Also memblock has page aligned range for ram, so we could avoid mapping partial pages. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/CAE9FiQVZirvaBMFYRfXMmWEcHbKSicQEHz4VAwUv0xFCk51ZNw@mail.gmail.comAcked-by: NJacob Shin <jacob.shin@amd.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org>
-
由 Yinghai Lu 提交于
We will not map partial pages, so need to make sure memblock allocation will not allocate those bytes out. Also we will use for_each_mem_pfn_range() to loop to map memory range to keep them consistent. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/CAE9FiQVZirvaBMFYRfXMmWEcHbKSicQEHz4VAwUv0xFCk51ZNw@mail.gmail.comAcked-by: NJacob Shin <jacob.shin@amd.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org>
-
- 24 10月, 2012 13 次提交
-
-
由 Dimitri Sivanich 提交于
Posting this patch to fix an issue concerning sparse irq's that I raised a while back. There was discussion about adding refcounting to sparse irqs (to fix other potential race conditions), but that does not appear to have been addressed yet. This covers the only issue of this type that I've encountered in this area. A NULL pointer dereference can occur in smp_irq_move_cleanup_interrupt() if we haven't yet setup the irq_cfg pointer in the irq_desc.irq_data.chip_data. In create_irq_nr() there is a window where we have set vector_irq in __assign_irq_vector(), but not yet called irq_set_chip_data() to set the irq_cfg pointer. Should an IRQ_MOVE_CLEANUP_VECTOR hit the cpu in question during this time, smp_irq_move_cleanup_interrupt() will attempt to process the aforementioned irq, but panic when accessing irq_cfg. Only continue processing the irq if irq_cfg is non-NULL. Signed-off-by: NDimitri Sivanich <sivanich@sgi.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Joerg Roedel <joerg.roedel@amd.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Alexander Gordeev <agordeev@redhat.com> Link: http://lkml.kernel.org/r/20121016125021.GA22935@sgi.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Wei Yongjun 提交于
The variable port is initialized but never used otherwise, so remove the unused variable. dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch) Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: Yan, Zheng <zheng.z.yan@intel.com> Cc: a.p.zijlstra@chello.nl Cc: paulus@samba.org Cc: acme@ghostprotocols.net Link: http://lkml.kernel.org/r/CAPgLHd8NZkYSkZm22FpZxiEh6HcA0q-V%3D29vdnheiDhgrJZ%2Byw@mail.gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Matt Fleming 提交于
Calling __pa() with an ioremap'd address is invalid. If we encounter an efi_memory_desc_t without EFI_MEMORY_WB set in ->attribute we currently call set_memory_uc(), which in turn calls __pa() on a potentially ioremap'd address. On CONFIG_X86_32 this results in the following oops: BUG: unable to handle kernel paging request at f7f22280 IP: [<c10257b9>] reserve_ram_pages_type+0x89/0x210 *pdpt = 0000000001978001 *pde = 0000000001ffb067 *pte = 0000000000000000 Oops: 0000 [#1] PREEMPT SMP Modules linked in: Pid: 0, comm: swapper Not tainted 3.0.0-acpi-efi-0805 #3 EIP: 0060:[<c10257b9>] EFLAGS: 00010202 CPU: 0 EIP is at reserve_ram_pages_type+0x89/0x210 EAX: 0070e280 EBX: 38714000 ECX: f7814000 EDX: 00000000 ESI: 00000000 EDI: 38715000 EBP: c189fef0 ESP: c189fea8 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 Process swapper (pid: 0, ti=c189e000 task=c18bbe60 task.ti=c189e000) Stack: 80000200 ff108000 00000000 c189ff00 00038714 00000000 00000000 c189fed0 c104f8ca 00038714 00000000 00038715 00000000 00000000 00038715 00000000 00000010 38715000 c189ff48 c1025aff 38715000 00000000 00000010 00000000 Call Trace: [<c104f8ca>] ? page_is_ram+0x1a/0x40 [<c1025aff>] reserve_memtype+0xdf/0x2f0 [<c1024dc9>] set_memory_uc+0x49/0xa0 [<c19334d0>] efi_enter_virtual_mode+0x1c2/0x3aa [<c19216d4>] start_kernel+0x291/0x2f2 [<c19211c7>] ? loglevel+0x1b/0x1b [<c19210bf>] i386_start_kernel+0xbf/0xc8 The only time we can call set_memory_uc() for a memory region is when it is part of the direct kernel mapping. For the case where we ioremap a memory region we must leave it alone. This patch reimplements the fix from e8c71062 ("x86, efi: Calling __pa() with an ioremap()ed address is invalid") which was reverted in e1ad783b because it caused a regression on some MacBooks (they hung at boot). The regression was caused because the commit only marked EFI_RUNTIME_SERVICES_DATA as E820_RESERVED_EFI, when it should have marked all regions that have the EFI_MEMORY_RUNTIME attribute. Despite first impressions, it's not possible to use ioremap_cache() to map all cached memory regions on CONFIG_X86_64 because of the way that the memory map might be configured as detailed in the following bug report, https://bugzilla.redhat.com/show_bug.cgi?id=748516 e.g. some of the EFI memory regions *need* to be mapped as part of the direct kernel mapping. Signed-off-by: NMatt Fleming <matt.fleming@intel.com> Cc: Matthew Garrett <mjg@redhat.com> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Huang Ying <huang.ying.caritas@gmail.com> Cc: Keith Packard <keithp@keithp.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1350649546-23541-1-git-send-email-matt@console-pimps.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Vince Weaver 提交于
Although based on the Intel P6 design, the interrupt mechnanism for KNC more closely resembles the Intel architectural perfmon one. We can't just re-use that code though, because KNC has different MSR numbers for the status and ack registers. In this case we just cut-and paste from perf_event_intel.c with some minor changes, as it looks like it would not be worth the trouble to change that code to be MSR-configurable. Signed-off-by: NVince Weaver <vincent.weaver@maine.edu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: eranian@gmail.com Cc: Meadows Lawrence F <lawrence.f.meadows@intel.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1210171304410.23243@vincent-weaver-1.um.maine.edu [ Small stylistic edits. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Vince Weaver 提交于
x86_pmu.enable() is called from x86_pmu_enable() with cpuc->enabled set to 0. This means we weren't re-enabling the counters after a context switch. This patch just removes the check, as it should't be necessary (and the equivelent x86_ generic code does not have the checks). The origin of this problem is the KNC driver being based on the P6 one. The P6 driver also has this issue, but works anyway due to various lucky accidents. Signed-off-by: NVince Weaver <vincent.weaver@maine.edu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: eranian@gmail.com Cc: Meadows Cc: Lawrence F <lawrence.f.meadows@intel.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1210171303290.23243@vincent-weaver-1.um.maine.eduSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Vince Weaver 提交于
Early versions of Intel KNC chips have a bug where bits above 32 were not properly set. We worked around this by only using the bottom 32 bits (out of 40 that should be available). It turns out this workaround breaks overflow handling. The buggy silicon will in theory never be used in production systems, so remove this workaround so we get proper overflow support. Signed-off-by: NVince Weaver <vincent.weaver@maine.edu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: eranian@gmail.com Cc: Meadows Lawrence F <lawrence.f.meadows@intel.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1210171302140.23243@vincent-weaver-1.um.maine.eduSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Yan, Zheng 提交于
This, beyond handling corner cases, also fixes some build warnings: arch/x86/kernel/cpu/perf_event_intel_uncore.c: In function ‘snbep_uncore_pci_disable_box’: arch/x86/kernel/cpu/perf_event_intel_uncore.c:124:9: warning: ‘config’ is used uninitialized in this function [-Wuninitialized] arch/x86/kernel/cpu/perf_event_intel_uncore.c: In function ‘snbep_uncore_pci_enable_box’: arch/x86/kernel/cpu/perf_event_intel_uncore.c:135:9: warning: ‘config’ is used uninitialized in this function [-Wuninitialized] arch/x86/kernel/cpu/perf_event_intel_uncore.c: In function ‘snbep_uncore_pci_read_counter’: arch/x86/kernel/cpu/perf_event_intel_uncore.c:164:2: warning: ‘count’ is used uninitialized in this function [-Wuninitialized] Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Cc: a.p.zijlstra@chello.nl Link: http://lkml.kernel.org/r/1351068140-13456-1-git-send-email-zheng.z.yan@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Jan Beulich 提交于
Commit 20167d34 ("x86-64: Fix accounting in kernel_physical_mapping_init()") went a little too far by entirely removing the counting of pre-populated page tables: this should be done at boot time (to cover the page tables set up in early boot code), but shouldn't be done during memory hot add. Hence, re-add the removed increments of "pages", but make them and the one in phys_pte_init() conditional upon !after_bootmem. Reported-Acked-and-Tested-by: NHugh Dickins <hughd@google.com> Signed-off-by: NJan Beulich <jbeulich@suse.com> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/506DAFBA020000780009FA8C@nat28.tlf.novell.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Vince Weaver 提交于
Between 2.6.33 and 2.6.34 the PMU code was made modular. The x86_pmu_enable() call was extended to disable cpuc->enabled and iterate the counters, enabling one at a time, before calling enable_all() at the end, followed by re-enabling cpuc->enabled. Since cpuc->enabled was set to 0, that change effectively caused the "val |= ARCH_PERFMON_EVENTSEL_ENABLE;" code in p6_pmu_enable_event() and p6_pmu_disable_event() to be dead code that was never called. This change removes this code (which was confusing) and adds some extra commentary to make it more clear what is going on. Signed-off-by: NVince Weaver <vincent.weaver@maine.edu> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1210191732000.14552@vincent-weaver-1.um.maine.eduSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Vince Weaver 提交于
This patch updates the generic events on p6, including some new extended cache events. Values for these events were taken from the equivelant PAPI predefined events. Tested on a Pentium II. Signed-off-by: NVince Weaver <vincent.weaver@maine.edu> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1210191730080.14552@vincent-weaver-1.um.maine.eduSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Vince Weaver 提交于
According to Intel SDM Volume 3B, FP_ASSIST is limited to Counter 1 only, not Counter 0. Tested on a Pentium II. Signed-off-by: NVince Weaver <vincent.weaver@maine.edu> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1210191728570.14552@vincent-weaver-1.um.maine.eduSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Dave Young 提交于
Commit: 722bc6b1 x86/mm: Fix the size calculation of mapping tables Tried to address the issue that the first 2/4M should use 4k pages if PSE enabled, but extra counts should only be valid for x86_32. This commit caused a kdump regression: the kdump kernel hangs. Work is in progress to fundamentally fix the various page table initialization issues that we have, via the design suggested by H. Peter Anvin, but it's not ready yet to be merged. So, to get a working kdump revert to the last known working version, which is the revert of this commit and of a followup fix (which was incomplete): bd2753b2 x86/mm: Only add extra pages count for the first memory range during pre-allocation Tested kdump on physical and virtual machines. Signed-off-by: NDave Young <dyoung@redhat.com> Acked-by: NYinghai Lu <yinghai@kernel.org> Acked-by: NCong Wang <xiyou.wangcong@gmail.com> Acked-by: NFlavio Leitner <fbl@redhat.com> Tested-by: NFlavio Leitner <fbl@redhat.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Flavio Leitner <fbl@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: ianfang.cn@gmail.com Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: <stable@kernel.org> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andre Przywara 提交于
In check_hw_exists() we try to detect non-emulated MSR accesses by writing an arbitrary value into one of the PMU registers and check if it's value after a readout is still the same. This algorithm silently assumes that the register does not contain the magic value already, which is wrong in at least one situation. Fix the algorithm to really do a read-modify-write cycle. This fixes a warning under Xen under some circumstances on AMD family 10h CPUs. The reasons in more details actually sound like a story from Believe It or Not!: First you need an AMD family 10h/12h CPU. These do not reset the PERF_CTR registers on a reboot. Now you boot bare metal Linux, which goes successfully through this check, but leaves the magic value of 0xabcd in the register. You don't use the performance counters, but do a reboot (warm reset). Then you choose to boot Xen. The check will be triggered with a recent Linux kernel as Dom0 again, trying to write 0xabcd into the MSR. Xen silently drops the write (expected), but the subsequent read will return the value in the register, which just happens to be the expected magic value. Thus the test misleadingly succeeds, leaving the kernel in the belief that the PMU is available. This will trigger the following message: [ 0.020294] ------------[ cut here ]------------ [ 0.020311] WARNING: at arch/x86/xen/enlighten.c:730 xen_apic_write+0x15/0x17() [ 0.020318] Hardware name: empty [ 0.020323] Modules linked in: [ 0.020334] Pid: 1, comm: swapper/0 Not tainted 3.3.8 #7 [ 0.020340] Call Trace: [ 0.020354] [<ffffffff81050379>] warn_slowpath_common+0x80/0x98 [ 0.020369] [<ffffffff810503a6>] warn_slowpath_null+0x15/0x17 [ 0.020378] [<ffffffff810034df>] xen_apic_write+0x15/0x17 [ 0.020392] [<ffffffff8101cb2b>] perf_events_lapic_init+0x2e/0x30 [ 0.020410] [<ffffffff81ee4dd0>] init_hw_perf_events+0x250/0x407 [ 0.020419] [<ffffffff81ee4b80>] ? check_bugs+0x2d/0x2d [ 0.020430] [<ffffffff81002181>] do_one_initcall+0x7a/0x131 [ 0.020444] [<ffffffff81edbbf9>] kernel_init+0x91/0x15d [ 0.020456] [<ffffffff817caaa4>] kernel_thread_helper+0x4/0x10 [ 0.020471] [<ffffffff817c347c>] ? retint_restore_args+0x5/0x6 [ 0.020481] [<ffffffff817caaa0>] ? gs_change+0x13/0x13 [ 0.020500] ---[ end trace a7919e7f17c0a725 ]--- The new code will change every of the 16 low bits read from the register and tries to write and read-back that modified number from the MSR. Signed-off-by: NAndre Przywara <andre.przywara@amd.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Avi Kivity <avi@redhat.com> Link: http://lkml.kernel.org/r/1349797115-28346-2-git-send-email-andre.przywara@amd.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 23 10月, 2012 3 次提交
-
-
由 Sasha Levin 提交于
KVM_PV_REASON_PAGE_NOT_PRESENT kicks cpu out of idleness, but we haven't marked that spot as an exit from idleness. Not doing so can cause RCU warnings such as: [ 732.788386] =============================== [ 732.789803] [ INFO: suspicious RCU usage. ] [ 732.790032] 3.7.0-rc1-next-20121019-sasha-00002-g6d8d02d-dirty #63 Tainted: G W [ 732.790032] ------------------------------- [ 732.790032] include/linux/rcupdate.h:738 rcu_read_lock() used illegally while idle! [ 732.790032] [ 732.790032] other info that might help us debug this: [ 732.790032] [ 732.790032] [ 732.790032] RCU used illegally from idle CPU! [ 732.790032] rcu_scheduler_active = 1, debug_locks = 1 [ 732.790032] RCU used illegally from extended quiescent state! [ 732.790032] 2 locks held by trinity-child31/8252: [ 732.790032] #0: (&rq->lock){-.-.-.}, at: [<ffffffff83a67528>] __schedule+0x178/0x8f0 [ 732.790032] #1: (rcu_read_lock){.+.+..}, at: [<ffffffff81152bde>] cpuacct_charge+0xe/0x200 [ 732.790032] [ 732.790032] stack backtrace: [ 732.790032] Pid: 8252, comm: trinity-child31 Tainted: G W 3.7.0-rc1-next-20121019-sasha-00002-g6d8d02d-dirty #63 [ 732.790032] Call Trace: [ 732.790032] [<ffffffff8118266b>] lockdep_rcu_suspicious+0x10b/0x120 [ 732.790032] [<ffffffff81152c60>] cpuacct_charge+0x90/0x200 [ 732.790032] [<ffffffff81152bde>] ? cpuacct_charge+0xe/0x200 [ 732.790032] [<ffffffff81158093>] update_curr+0x1a3/0x270 [ 732.790032] [<ffffffff81158a6a>] dequeue_entity+0x2a/0x210 [ 732.790032] [<ffffffff81158ea5>] dequeue_task_fair+0x45/0x130 [ 732.790032] [<ffffffff8114ae29>] dequeue_task+0x89/0xa0 [ 732.790032] [<ffffffff8114bb9e>] deactivate_task+0x1e/0x20 [ 732.790032] [<ffffffff83a67c29>] __schedule+0x879/0x8f0 [ 732.790032] [<ffffffff8117e20d>] ? trace_hardirqs_off+0xd/0x10 [ 732.790032] [<ffffffff810a37a5>] ? kvm_async_pf_task_wait+0x1d5/0x2b0 [ 732.790032] [<ffffffff83a67cf5>] schedule+0x55/0x60 [ 732.790032] [<ffffffff810a37c4>] kvm_async_pf_task_wait+0x1f4/0x2b0 [ 732.790032] [<ffffffff81139e50>] ? abort_exclusive_wait+0xb0/0xb0 [ 732.790032] [<ffffffff81139c25>] ? prepare_to_wait+0x25/0x90 [ 732.790032] [<ffffffff810a3a66>] do_async_page_fault+0x56/0xa0 [ 732.790032] [<ffffffff83a6a6e8>] async_page_fault+0x28/0x30 Signed-off-by: NSasha Levin <sasha.levin@oracle.com> Acked-by: NGleb Natapov <gleb@redhat.com> Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Gleb Natapov 提交于
Signed-off-by: NGleb Natapov <gleb@redhat.com> Reviewed-by: NChegu Vinod <chegu_vinod@hp.com> Tested-by: NChegu Vinod <chegu_vinod@hp.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Xiao Guangrong 提交于
We can not directly call kvm_release_pfn_clean to release the pfn since we can meet noslot pfn which is used to cache mmio info into spte Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 20 10月, 2012 5 次提交
-
-
由 Yan, Zheng 提交于
Initializing uncore PMU on virtualized CPU may hang the kernel. This is because kvm does not emulate the entire hardware. Thers are lots of uncore related MSRs, making kvm enumerate them all is a non-trival task. So just disable uncore on virtualized CPU. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Tested-by: NPekka Enberg <penberg@kernel.org> Cc: a.p.zijlstra@chello.nl Cc: eranian@google.com Cc: andi@firstfloor.org Cc: avi@redhat.com Link: http://lkml.kernel.org/r/1345540117-14164-1-git-send-email-zheng.z.yan@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 David Vrabel 提交于
In 32 bit guests, if a userspace process has %eax == -ERESTARTSYS (-512) or -ERESTARTNOINTR (-513) when it is interrupted by an event /and/ the process has a pending signal then %eip (and %eax) are corrupted when returning to the main process after handling the signal. The application may then crash with SIGSEGV or a SIGILL or it may have subtly incorrect behaviour (depending on what instruction it returned to). The occurs because handle_signal() is incorrectly thinking that there is a system call that needs to restarted so it adjusts %eip and %eax to re-execute the system call instruction (even though user space had not done a system call). If %eax == -514 (-ERESTARTNOHAND (-514) or -ERESTART_RESTARTBLOCK (-516) then handle_signal() only corrupted %eax (by setting it to -EINTR). This may cause the application to crash or have incorrect behaviour. handle_signal() assumes that regs->orig_ax >= 0 means a system call so any kernel entry point that is not for a system call must push a negative value for orig_ax. For example, for physical interrupts on bare metal the inverse of the vector is pushed and page_fault() sets regs->orig_ax to -1, overwriting the hardware provided error code. xen_hypervisor_callback() was incorrectly pushing 0 for orig_ax instead of -1. Classic Xen kernels pushed %eax which works as %eax cannot be both non-negative and -RESTARTSYS (etc.), but using -1 is consistent with other non-system call entry points and avoids some of the tests in handle_signal(). There were similar bugs in xen_failsafe_callback() of both 32 and 64-bit guests. If the fault was corrected and the normal return path was used then 0 was incorrectly pushed as the value for orig_ax. Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com> Acked-by: NJan Beulich <JBeulich@suse.com> Acked-by: NIan Campbell <ian.campbell@citrix.com> Cc: stable@vger.kernel.org Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Ian Campbell 提交于
This correctly sizes it as 64 bit on ARM but leaves it as unsigned long on x86 (therefore no intended change on x86). The long and ulong guest handles are now unused (and a bit dangerous) so remove them. Acked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: NIan Campbell <ian.campbell@citrix.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Ian Campbell 提交于
Define PRI macros for xen_ulong_t and xen_pfn_t and use to fix: drivers/xen/sys-hypervisor.c:288:4: warning: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'xen_ulong_t' [-Wformat] Ideally this would use PRIx64 on ARM but these (or equivalent) don't seem to be available in the kernel. Acked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: NIan Campbell <ian.campbell@citrix.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Wei Yongjun 提交于
Remove duplicated include. dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch) CC: stable@vger.kernel.org Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-