- 17 4月, 2013 8 次提交
-
-
由 Konrad Rzeszutek Wilk 提交于
See git commit f10cd522 (xen: disable PV spinlocks on HVM) for details. But we did not disable it everywhere - which means that when we boot as PVHVM we end up allocating per-CPU irq line for spinlock. This fixes that. Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Konrad Rzeszutek Wilk 提交于
The default (uninitialized) value of the IRQ line is -1. Check if we already have allocated an spinlock interrupt line and if somebody is trying to do it again. Also set it to -1 when we offline the CPU. Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Konrad Rzeszutek Wilk 提交于
If the timer interrupt has been de-init or is just now being initialized, the default value of -1 should be preset as interrupt line. Check for that and if something is odd WARN us. Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Konrad Rzeszutek Wilk 提交于
When we online the CPU, we get this splat: smpboot: Booting Node 0 Processor 1 APIC 0x2 installing Xen timer for CPU 1 BUG: sleeping function called from invalid context at /home/konrad/ssd/konrad/linux/mm/slab.c:3179 in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/1 Pid: 0, comm: swapper/1 Not tainted 3.9.0-rc6upstream-00001-g3884fad #1 Call Trace: [<ffffffff810c1fea>] __might_sleep+0xda/0x100 [<ffffffff81194617>] __kmalloc_track_caller+0x1e7/0x2c0 [<ffffffff81303758>] ? kasprintf+0x38/0x40 [<ffffffff813036eb>] kvasprintf+0x5b/0x90 [<ffffffff81303758>] kasprintf+0x38/0x40 [<ffffffff81044510>] xen_setup_timer+0x30/0xb0 [<ffffffff810445af>] xen_hvm_setup_cpu_clockevents+0x1f/0x30 [<ffffffff81666d0a>] start_secondary+0x19c/0x1a8 The solution to that is use kasprintf in the CPU hotplug path that 'online's the CPU. That is, do it in in xen_hvm_cpu_notify, and remove the call to in xen_hvm_setup_cpu_clockevents. Unfortunatly the later is not a good idea as the bootup path does not use xen_hvm_cpu_notify so we would end up never allocating timer%d interrupt lines when booting. As such add the check for atomic() to continue. CC: stable@vger.kernel.org Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Konrad Rzeszutek Wilk 提交于
While we don't use the spinlock interrupt line (see for details commit f10cd522 - xen: disable PV spinlocks on HVM) - we should still do the proper init / deinit sequence. We did not do that correctly and for the CPU init for PVHVM guest we would allocate an interrupt line - but failed to deallocate the old interrupt line. This resulted in leakage of an irq_desc but more importantly this splat as we online an offlined CPU: genirq: Flags mismatch irq 71. 0002cc20 (spinlock1) vs. 0002cc20 (spinlock1) Pid: 2542, comm: init.late Not tainted 3.9.0-rc6upstream #1 Call Trace: [<ffffffff811156de>] __setup_irq+0x23e/0x4a0 [<ffffffff81194191>] ? kmem_cache_alloc_trace+0x221/0x250 [<ffffffff811161bb>] request_threaded_irq+0xfb/0x160 [<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20 [<ffffffff813a8423>] bind_ipi_to_irqhandler+0xa3/0x160 [<ffffffff81303758>] ? kasprintf+0x38/0x40 [<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20 [<ffffffff810cad35>] ? update_max_interval+0x15/0x40 [<ffffffff816605db>] xen_init_lock_cpu+0x3c/0x78 [<ffffffff81660029>] xen_hvm_cpu_notify+0x29/0x33 [<ffffffff81676bdd>] notifier_call_chain+0x4d/0x70 [<ffffffff810bb2a9>] __raw_notifier_call_chain+0x9/0x10 [<ffffffff8109402b>] __cpu_notify+0x1b/0x30 [<ffffffff8166834a>] _cpu_up+0xa0/0x14b [<ffffffff816684ce>] cpu_up+0xd9/0xec [<ffffffff8165f754>] store_online+0x94/0xd0 [<ffffffff8141d15b>] dev_attr_store+0x1b/0x20 [<ffffffff81218f44>] sysfs_write_file+0xf4/0x170 [<ffffffff811a2864>] vfs_write+0xb4/0x130 [<ffffffff811a302a>] sys_write+0x5a/0xa0 [<ffffffff8167ada9>] system_call_fastpath+0x16/0x1b cpu 1 spinlock event irq -16 smpboot: Booting Node 0 Processor 1 APIC 0x2 And if one looks at the /proc/interrupts right after offlining (CPU1): 70: 0 0 xen-percpu-ipi spinlock0 71: 0 0 xen-percpu-ipi spinlock1 77: 0 0 xen-percpu-ipi spinlock2 There is the oddity of the 'spinlock1' still being present. CC: stable@vger.kernel.org Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Konrad Rzeszutek Wilk 提交于
In the PVHVM path when we do CPU online/offline path we would leak the timer%d IRQ line everytime we do a offline event. The online path (xen_hvm_setup_cpu_clockevents via x86_cpuinit.setup_percpu_clockev) would allocate a new interrupt line for the timer%d. But we would still use the old interrupt line leading to: kernel BUG at /home/konrad/ssd/konrad/linux/kernel/hrtimer.c:1261! invalid opcode: 0000 [#1] SMP RIP: 0010:[<ffffffff810b9e21>] [<ffffffff810b9e21>] hrtimer_interrupt+0x261/0x270 .. snip.. <IRQ> [<ffffffff810445ef>] xen_timer_interrupt+0x2f/0x1b0 [<ffffffff81104825>] ? stop_machine_cpu_stop+0xb5/0xf0 [<ffffffff8111434c>] handle_irq_event_percpu+0x7c/0x240 [<ffffffff811175b9>] handle_percpu_irq+0x49/0x70 [<ffffffff813a74a3>] __xen_evtchn_do_upcall+0x1c3/0x2f0 [<ffffffff813a760a>] xen_evtchn_do_upcall+0x2a/0x40 [<ffffffff8167c26d>] xen_hvm_callback_vector+0x6d/0x80 <EOI> [<ffffffff81666d01>] ? start_secondary+0x193/0x1a8 [<ffffffff81666cfd>] ? start_secondary+0x18f/0x1a8 There is also the oddity (timer1) in the /proc/interrupts after offlining CPU1: 64: 1121 0 xen-percpu-virq timer0 78: 0 0 xen-percpu-virq timer1 84: 0 2483 xen-percpu-virq timer2 This patch fixes it. Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> CC: stable@vger.kernel.org
-
由 Jan Beulich 提交于
For quite a few Xen versions, this wasn't the IRQ vector anymore anyway, and it is not being used by the kernel for anything. Hence drop the field from struct irq_info, and respective function parameters. Signed-off-by: NJan Beulich <jbeulich@suse.com> Acked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 David Vrabel 提交于
During early setup of a dom0 kernel, populate boot_params with the Enhanced Disk Drive (EDD) and MBR signature data. This makes information on the BIOS boot device available in /sys/firmware/edd/. Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com> Acked-by: NJan Beulich <jbeulich@suse.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 07 4月, 2013 1 次提交
-
-
由 Andrew Honig 提交于
This patch adds support for kvm_gfn_to_hva_cache_init functions for reads and writes that will cross a page. If the range falls within the same memslot, then this will be a fast operation. If the range is split between two memslots, then the slower kvm_read_guest and kvm_write_guest are used. Tested: Test against kvm_clock unit tests. Signed-off-by: NAndrew Honig <ahonig@google.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
- 06 4月, 2013 1 次提交
-
-
由 Jan Beulich 提交于
eboot.o and efi_stub_$(BITS).o didn't get added to "targets", and hence their .cmd files don't get included by the build machinery, leading to the files always getting rebuilt. Rather than adding the two files individually, take the opportunity and add $(VMLINUX_OBJS) to "targets" instead, thus allowing the assignment at the top of the file to be shrunk quite a bit. At the same time, remove a pointless flags override line - the variable assigned to was misspelled anyway, and the options added are meaningless for assembly sources. [ hpa: the patch is not minimal, but I am taking it for -urgent anyway since the excess impact of the patch seems to be small enough. ] Signed-off-by: NJan Beulich <jbeulich@suse.com> Link: http://lkml.kernel.org/r/515C5D2502000078000CA6AD@nat28.tlf.novell.com Cc: Matthew Garrett <mjg@redhat.com> Cc: Matt Fleming <matt.fleming@intel.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 03 4月, 2013 1 次提交
-
-
由 Paul Moore 提交于
Commit fca460f9 simplified the x32 implementation by creating a syscall bitmask, equal to 0x40000000, that could be applied to x32 syscalls such that the masked syscall number would be the same as a x86_64 syscall. While that patch was a nice way to simplify the code, it went a bit too far by adding the mask to syscall_get_nr(); returning the masked syscall numbers can cause confusion with callers that expect syscall numbers matching the x32 ABI, e.g. unmasked syscall numbers. This patch fixes this by simply removing the mask from syscall_get_nr() while preserving the other changes from the original commit. While there are several syscall_get_nr() callers in the kernel, most simply check that the syscall number is greater than zero, in this case this patch will have no effect. Of those remaining callers, they appear to be few, seccomp and ftrace, and from my testing of seccomp without this patch the original commit definitely breaks things; the seccomp filter does not correctly filter the syscalls due to the difference in syscall numbers in the BPF filter and the value from syscall_get_nr(). Applying this patch restores the seccomp BPF filter functionality on x32. I've tested this patch with the seccomp BPF filters as well as ftrace and everything looks reasonable to me; needless to say general usage seemed fine as well. Signed-off-by: NPaul Moore <pmoore@redhat.com> Link: http://lkml.kernel.org/r/20130215172143.12549.10292.stgit@localhost Cc: <stable@vger.kernel.org> Cc: Will Drewry <wad@chromium.org> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 28 3月, 2013 1 次提交
-
-
由 Konrad Rzeszutek Wilk 提交于
We move the setting of write_cr3 from the early bootup variant (see git commit 0cc9129d "x86-64, xen, mmu: Provide an early version of write_cr3.") to a more appropiate location. This new location sets all of the other non-early variants of pvops calls - and most importantly is before the alternative_asm mechanism kicks in. Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 25 3月, 2013 1 次提交
-
-
由 Konrad Rzeszutek Wilk 提交于
They are defined in coreboot (MSR_PLATFORM) and the other one is already defined in msr-index.h. Let's use those. Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: NViresh Kumar <viresh.kumar@linaro.org> Acked-by: NDirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 22 3月, 2013 2 次提交
-
-
由 Jan Beulich 提交于
For MSI-X capable devices the hypervisor wants to write protect the MSI-X table and PBA, yet it can't assume that resources have been assigned to their final values at device enumeration time. Thus have pciback do that notification, as having the device controlled by it is a prerequisite to assigning the device to guests anyway. This is the kernel part of hypervisor side commit 4245d33 ("x86/MSI: add mechanism to fully protect MSI-X table from PV guest accesses") on the master branch of git://xenbits.xen.org/xen.git. CC: stable@vger.kernel.org Signed-off-by: NJan Beulich <jbeulich@suse.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 H. Peter Anvin 提交于
Add missing __cpuinit annotation to apply_microcode_early(). Reported-by: NShaun Ruffell <sruffell@digium.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/20130320170310.GA23362@digium.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 20 3月, 2013 3 次提交
-
-
由 Fenghua Yu 提交于
In 32-bit, __pa_symbol() in CONFIG_DEBUG_VIRTUAL accesses kernel data (e.g. max_low_pfn) that not only hasn't been setup yet in such early boot phase, but since we are in linear mode, cannot even be detected as uninitialized. Thus, use __pa_nodebug() rather than __pa_symbol() to get a global symbol's physical address. Signed-off-by: NFenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1363705484-27645-1-git-send-email-fenghua.yu@intel.comReported-and-tested-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Andy Honig 提交于
There is a potential use after free issue with the handling of MSR_KVM_SYSTEM_TIME. If the guest specifies a GPA in a movable or removable memory such as frame buffers then KVM might continue to write to that address even after it's removed via KVM_SET_USER_MEMORY_REGION. KVM pins the page in memory so it's unlikely to cause an issue, but if the user space component re-purposes the memory previously used for the guest, then the guest will be able to corrupt that memory. Tested: Tested against kvmclock unit test Signed-off-by: NAndrew Honig <ahonig@google.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Andy Honig 提交于
If the guest sets the GPA of the time_page so that the request to update the time straddles a page then KVM will write onto an incorrect page. The write is done byusing kmap atomic to get a pointer to the page for the time structure and then performing a memcpy to that page starting at an offset that the guest controls. Well behaved guests always provide a 32-byte aligned address, however a malicious guest could use this to corrupt host kernel memory. Tested: Tested against kvmclock unit test. Signed-off-by: NAndrew Honig <ahonig@google.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
- 19 3月, 2013 2 次提交
-
-
由 Marcelo Tosatti 提交于
There is a deadlock in pvclock handling: cpu0: cpu1: kvm_gen_update_masterclock() kvm_guest_time_update() spin_lock(pvclock_gtod_sync_lock) local_irq_save(flags) spin_lock(pvclock_gtod_sync_lock) kvm_make_mclock_inprogress_request(kvm) make_all_cpus_request() smp_call_function_many() Now if smp_call_function_many() called by cpu0 tries to call function on cpu1 there will be a deadlock. Fix by moving pvclock_gtod_sync_lock protected section outside irq disabled section. Analyzed by Gleb Natapov <gleb@redhat.com> Acked-by: NGleb Natapov <gleb@redhat.com> Reported-and-Tested-by: NYongjie Ren <yongjie.ren@intel.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 CQ Tang 提交于
The increment of "to" in copy_user_handle_tail() will have incremented before a failure has been noted. This causes us to skip a byte in the failure case. Only do the increment when assured there is no failure. Signed-off-by: NCQ Tang <cq.tang@intel.com> Link: http://lkml.kernel.org/r/20130318150221.8439.993.stgit@phlsvslse11.ph.intel.comSigned-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org>
-
- 18 3月, 2013 3 次提交
-
-
由 Stephane Eranian 提交于
Add scheduling constraints for SNB/SNB-EP CYCLE_ACTIVITY event as defined by SDM Jan 2013 edition. The STALLS umasks are combinations with the NO_DISPATCH umask. Signed-off-by: NStephane Eranian <eranian@gmail.com> Cc: peterz@infradead.org Cc: ak@linux.intel.com Cc: jolsa@redhat.com Link: http://lkml.kernel.org/r/20130317134957.GA8550@quadSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Masami Hiramatsu 提交于
Currently kprobes check whether the copied instruction modifies IF (interrupt flag) on each probe hit. This results not only in introducing overhead but also involving inat_get_opcode_attribute into the kprobes hot path, and it can cause an infinite recursive call (and kernel panic in the end). Actually, since the copied instruction itself can never be modified on the buffer, it is needless to analyze the instruction on every probe hit. To fix this issue, we check it only once when registering probe and store the result on ainsn->if_modifier. Reported-by: NTimo Juhani Lindfors <timo.lindfors@iki.fi> Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Acked-by: NAnanth N Mavinakayanahalli <ananth@in.ibm.com> Cc: yrl.pp-manager.tt@hitachi.com Cc: Steven Rostedt <rostedt@goodmis.org> Cc: David S. Miller <davem@davemloft.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20130314115242.19690.33573.stgit@mhiramat-M0-7522Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Linus Torvalds 提交于
Commit 1d9d8639 ("perf,x86: fix kernel crash with PEBS/BTS after suspend/resume") fixed a crash when doing PEBS performance profiling after resuming, but in using init_debug_store_on_cpu() to restore the DS_AREA mtrr it also resulted in a new WARN_ON() triggering. init_debug_store_on_cpu() uses "wrmsr_on_cpu()", which in turn uses CPU cross-calls to do the MSR update. Which is not really valid at the early resume stage, and the warning is quite reasonable. Now, it all happens to _work_, for the simple reason that smp_call_function_single() ends up just doing the call directly on the CPU when the CPU number matches, but we really should just do the wrmsr() directly instead. This duplicates the wrmsr() logic, but hopefully we can just remove the wrmsr_on_cpu() version eventually. Reported-and-tested-by: NParag Warudkar <parag.lkml@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 16 3月, 2013 1 次提交
-
-
由 Stephane Eranian 提交于
This patch fixes a kernel crash when using precise sampling (PEBS) after a suspend/resume. Turns out the CPU notifier code is not invoked on CPU0 (BP). Therefore, the DS_AREA (used by PEBS) is not restored properly by the kernel and keeps it power-on/resume value of 0 causing any PEBS measurement to crash when running on CPU0. The workaround is to add a hook in the actual resume code to restore the DS Area MSR value. It is invoked for all CPUS. So for all but CPU0, the DS_AREA will be restored twice but this is harmless. Reported-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NStephane Eranian <eranian@google.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 3月, 2013 1 次提交
-
-
由 Stephen Rothwell 提交于
In commit 887cbce0 ("arch Kconfig: centralise ARCH_NO_VIRT_TO_BUS") I introduced the config sybmol HAVE_VIRT_TO_BUS and selected that where needed. I am not sure what I was thinking. Instead, just directly select VIRT_TO_BUS where it is needed. Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 08 3月, 2013 1 次提交
-
-
由 Dave Hansen 提交于
kernel_map_sync_memtype() is called from a variety of contexts. The pat.c code that calls it seems to ensure that it is not called for non-ram areas by checking via pat_pagerange_is_ram(). It is important that it only be called on the actual identity map because there *IS* no map to sync for highmem pages, or for memory holes. The ioremap.c uses are not as careful as those from pat.c, and call kernel_map_sync_memtype() on PCI space which is in the middle of the kernel identity map _range_, but is not actually mapped. This patch adds a check to kernel_map_sync_memtype() which probably duplicates some of the checks already in pat.c. But, it is necessary for the ioremap.c uses and shouldn't hurt other callers. I have reproduced this bug and this patch fixes it for me and the original bug reporter: https://lkml.org/lkml/2013/2/5/396Signed-off-by: NDave Hansen <dave@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20130307163151.D9B58C4E@kernel.stglabs.ibm.comSigned-off-by: NDave Hansen <dave@sr71.net> Tested-by: NTetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 07 3月, 2013 4 次提交
-
-
由 Peter Jones 提交于
If the sentinel triggers, we do not want the boot loader authors to just poke it and make the error go away, we want them to actually fix the problem. This should help avoid making the incorrect change in non-compliant bootloaders. [ hpa: dropped the Documentation/x86/boot.txt hunk pending clarifications ] Signed-off-by: NPeter Jones <pjones@redhat.com> Link: http://lkml.kernel.org/r/1362592823-28967-1-git-send-email-pjones@redhat.comSigned-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Josh Boyer 提交于
When boot_params->sentinel is set, all we really know is that some undefined set of fields in struct boot_params contain garbage. In the particular case of efi_info, however, there is a private magic for that substructure, so it is generally safe to leave it even if the bootloader is broken. kexec (for which we did the initial analysis) did not initialize this field, but of course all the EFI bootloaders do, and most EFI bootloaders are broken in this respect (and should be fixed.) Reported-by: NRobin Holt <holt@sgi.com> Link: http://lkml.kernel.org/r/CA%2B5PVA51-FT14p4CRYKbicykugVb=PiaEycdQ57CK2km_OQuRQ@mail.gmail.comTested-by: NJosh Boyer <jwboyer@gmail.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Yinghai Lu 提交于
Henrik reported that his MacAir 3.1 would not boot with | commit 8d57470d | Date: Fri Nov 16 19:38:58 2012 -0800 | | x86, mm: setup page table in top-down It turns out that we do not calculate the real_end properly: We try to get 2M size with 4K alignment, and later will round down to 2M, so we will get less then 2M for first mapping, in extreme case could be only 4K only. In Henrik's system it has (1M-32K) as last usable rage is [mem 0x7f9db000-0x7fef8fff]. The problem is exposed when EFI booting have several holes and it will force mapping to use PTE instead as we only map usable areas. To fix it, just make it be 2M aligned, so we can be guaranteed to be able to use large pages to map it. Reported-by: NHenrik Rydberg <rydberg@euromail.se> Bisected-by: NHenrik Rydberg <rydberg@euromail.se> Tested-by: NHenrik Rydberg <rydberg@euromail.se> Signed-off-by: NYinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/CAE9FiQX4nQ7_1kg5RL_vh56rmcSHXUi1ExrZX7CwED4NGMnHfg@mail.gmail.comSigned-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Krzysztof Mazur 提交于
The commit 27be4570 ('x86 idle: remove 32-bit-only "no-hlt" parameter, hlt_works_ok flag') removed the hlt_works_ok flag from struct cpuinfo_x86, but boot_cpu_data and new_cpu_data initializers were not changed causing setting f00f_bug flag, instead of fdiv_bug. If CONFIG_X86_F00F_BUG is not set the f00f_bug flag is never cleared. To avoid such problems in future C99-style initialization is now used. Signed-off-by: NKrzysztof Mazur <krzysiek@podlesie.net> Acked-by: NBorislav Petkov <bp@suse.de> Cc: len.brown@intel.com Link: http://lkml.kernel.org/r/1362266082-2227-1-git-send-email-krzysiek@podlesie.netSigned-off-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 06 3月, 2013 1 次提交
-
-
由 Borislav Petkov 提交于
The cpuinfo_x86 ptr is unused now. Drop it. Got obsolete by 69fb3676 ("x86 idle: remove mwait_idle() and "idle=mwait" cmdline param") removing its only user. [ hpa: fixes gcc warning ] Signed-off-by: NBorislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1362428180-8865-2-git-send-email-bp@alien8.de Cc: Len Brown <len.brown@intel.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 03 3月, 2013 1 次提交
-
-
由 Yinghai Lu 提交于
Tim found: WARNING: at arch/x86/kernel/smpboot.c:324 topology_sane.isra.2+0x6f/0x80() Hardware name: S2600CP sched: CPU #1's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency. smpboot: Booting Node 1, Processors #1 Modules linked in: Pid: 0, comm: swapper/1 Not tainted 3.9.0-0-generic #1 Call Trace: set_cpu_sibling_map+0x279/0x449 start_secondary+0x11d/0x1e5 Don Morris reproduced on a HP z620 workstation, and bisected it to commit e8d19552 ("acpi, memory-hotplug: parse SRAT before memblock is ready") It turns out movable_map has some problems, and it breaks several things 1. numa_init is called several times, NOT just for srat. so those nodes_clear(numa_nodes_parsed) memset(&numa_meminfo, 0, sizeof(numa_meminfo)) can not be just removed. Need to consider sequence is: numaq, srat, amd, dummy. and make fall back path working. 2. simply split acpi_numa_init to early_parse_srat. a. that early_parse_srat is NOT called for ia64, so you break ia64. b. for (i = 0; i < MAX_LOCAL_APIC; i++) set_apicid_to_node(i, NUMA_NO_NODE) still left in numa_init. So it will just clear result from early_parse_srat. it should be moved before that.... c. it breaks ACPI_TABLE_OVERIDE...as the acpi table scan is moved early before override from INITRD is settled. 3. that patch TITLE is total misleading, there is NO x86 in the title, but it changes critical x86 code. It caused x86 guys did not pay attention to find the problem early. Those patches really should be routed via tip/x86/mm. 4. after that commit, following range can not use movable ram: a. real_mode code.... well..funny, legacy Node0 [0,1M) could be hot-removed? b. initrd... it will be freed after booting, so it could be on movable... c. crashkernel for kdump...: looks like we can not put kdump kernel above 4G anymore. d. init_mem_mapping: can not put page table high anymore. e. initmem_init: vmemmap can not be high local node anymore. That is not good. If node is hotplugable, the mem related range like page table and vmemmap could be on the that node without problem and should be on that node. We have workaround patch that could fix some problems, but some can not be fixed. So just remove that offending commit and related ones including: f7210e6c ("mm/memblock.c: use CONFIG_HAVE_MEMBLOCK_NODE_MAP to protect movablecore_map in memblock_overlaps_region().") 01a178a9 ("acpi, memory-hotplug: support getting hotplug info from SRAT") 27168d38 ("acpi, memory-hotplug: extend movablemem_map ranges to the end of node") e8d19552 ("acpi, memory-hotplug: parse SRAT before memblock is ready") fb06bc8e ("page_alloc: bootmem limit with movablecore_map") 42f47e27 ("page_alloc: make movablemem_map have higher priority") 6981ec31 ("page_alloc: introduce zone_movable_limit[] to keep movable limit for nodes") 34b71f1e ("page_alloc: add movable_memmap kernel parameter") 4d59a751 ("x86: get pg_data_t's memory from other node") Later we should have patches that will make sure kernel put page table and vmemmap on local node ram instead of push them down to node0. Also need to find way to put other kernel used ram to local node ram. Reported-by: NTim Gardner <tim.gardner@canonical.com> Reported-by: NDon Morris <don.morris@hp.com> Bisected-by: NDon Morris <don.morris@hp.com> Tested-by: NDon Morris <don.morris@hp.com> Signed-off-by: NYinghai Lu <yinghai@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Thomas Renninger <trenn@suse.de> Cc: Tejun Heo <tj@kernel.org> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 01 3月, 2013 1 次提交
-
-
由 Konrad Rzeszutek Wilk 提交于
There is no hypercall to setup multiple MSI per PCI device. As such with these two new commits: - 08261d87 PCI/MSI: Enable multiple MSIs with pci_enable_msi_block_auto() - 5ca72c4f AHCI: Support multiple MSIs we would call the PHYSDEVOP_map_pirq 'nvec' times with the same contents of the PCI device. Sander discovered that we would get the same PIRQ value 'nvec' times and return said values to the caller. That of course meant that the device was configured only with one MSI and AHCI would fail with: ahci 0000:00:11.0: version 3.0 xen: registering gsi 19 triggering 0 polarity 1 xen: --> pirq=19 -> irq=19 (gsi=19) (XEN) [2013-02-27 19:43:07] IOAPIC[0]: Set PCI routing entry (6-19 -> 0x99 -> IRQ 19 Mode:1 Active:1) ahci 0000:00:11.0: AHCI 0001.0200 32 slots 4 ports 6 Gbps 0xf impl SATA mode ahci 0000:00:11.0: flags: 64bit ncq sntf ilck pm led clo pmp pio slum part ahci: probe of 0000:00:11.0 failed with error -22 That is b/c in ahci_host_activate the second call to devm_request_threaded_irq would return -EINVAL as we passed in (on the second run) an IRQ that was never initialized. CC: stable@vger.kernel.org Reported-and-Tested-by: NSander Eikelenboom <linux@eikelenboom.it> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 28 2月, 2013 6 次提交
-
-
由 Konrad Rzeszutek Wilk 提交于
The git commit 8eaffa67 (xen/pat: Disable PAT support for now) explains in details why we want to disable PAT for right now. However that change was not enough and we should have also disabled the pat_enabled value. Otherwise we end up with: mmap-example:3481 map pfn expected mapping type write-back for [mem 0x00010000-0x00010fff], got uncached-minus ------------[ cut here ]------------ WARNING: at /build/buildd/linux-3.8.0/arch/x86/mm/pat.c:774 untrack_pfn+0xb8/0xd0() mem 0x00010000-0x00010fff], got uncached-minus ------------[ cut here ]------------ WARNING: at /build/buildd/linux-3.8.0/arch/x86/mm/pat.c:774 untrack_pfn+0xb8/0xd0() ... Pid: 3481, comm: mmap-example Tainted: GF 3.8.0-6-generic #13-Ubuntu Call Trace: [<ffffffff8105879f>] warn_slowpath_common+0x7f/0xc0 [<ffffffff810587fa>] warn_slowpath_null+0x1a/0x20 [<ffffffff8104bcc8>] untrack_pfn+0xb8/0xd0 [<ffffffff81156c1c>] unmap_single_vma+0xac/0x100 [<ffffffff81157459>] unmap_vmas+0x49/0x90 [<ffffffff8115f808>] exit_mmap+0x98/0x170 [<ffffffff810559a4>] mmput+0x64/0x100 [<ffffffff810560f5>] dup_mm+0x445/0x660 [<ffffffff81056d9f>] copy_process.part.22+0xa5f/0x1510 [<ffffffff81057931>] do_fork+0x91/0x350 [<ffffffff81057c76>] sys_clone+0x16/0x20 [<ffffffff816ccbf9>] stub_clone+0x69/0x90 [<ffffffff816cc89d>] ? system_call_fastpath+0x1a/0x1f ---[ end trace 4918cdd0a4c9fea4 ]--- (a similar message shows up if you end up launching 'mcelog') The call chain is (as analyzed by Liu, Jinsong): do_fork --> copy_process --> dup_mm --> dup_mmap --> copy_page_range --> track_pfn_copy --> reserve_pfn_range --> line 624: flags != want_flags It comes from different memory types of page table (_PAGE_CACHE_WB) and MTRR (_PAGE_CACHE_UC_MINUS). Stefan Bader dug in this deep and found out that: "That makes it clearer as this will do reserve_memtype(...) --> pat_x_mtrr_type --> mtrr_type_lookup --> __mtrr_type_lookup And that can return -1/0xff in case of MTRR not being enabled/initialized. Which is not the case (given there are no messages for it in dmesg). This is not equal to MTRR_TYPE_WRBACK and thus becomes _PAGE_CACHE_UC_MINUS. It looks like the problem starts early in reserve_memtype: if (!pat_enabled) { /* This is identical to page table setting without PAT */ if (new_type) { if (req_type == _PAGE_CACHE_WC) *new_type = _PAGE_CACHE_UC_MINUS; else *new_type = req_type & _PAGE_CACHE_MASK; } return 0; } This would be what we want, that is clearing the PWT and PCD flags from the supported flags - if pat_enabled is disabled." This patch does that - disabling PAT. CC: stable@vger.kernel.org # 3.3 and further Reported-by: NSander Eikelenboom <linux@eikelenboom.it> Reported-and-Tested-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reported-and-Tested-by: NStefan Bader <stefan.bader@canonical.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Peter Hurley 提交于
The physical memory fixmapped for the pvclock clock_gettime vsyscall was allocated, and thus is not a kernel symbol. __pa() is the proper method to use in this case. Fixes the crash below when booting a next-20130204+ smp guest on a 3.8-rc5+ KVM host. [ 0.666410] udevd[97]: starting version 175 [ 0.674043] udevd[97]: udevd:[97]: segfault at ffffffffff5fd020 ip 00007fff069e277f sp 00007fff068c9ef8 error d Acked-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NPeter Hurley <peter@hurleysoftware.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Sasha Levin 提交于
I'm not sure why, but the hlist for each entry iterators were conceived list_for_each_entry(pos, head, member) The hlist ones were greedy and wanted an extra parameter: hlist_for_each_entry(tpos, pos, head, member) Why did they need an extra pos parameter? I'm not quite sure. Not only they don't really need it, it also prevents the iterator from looking exactly like the list iterator, which is unfortunate. Besides the semantic patch, there was some manual work required: - Fix up the actual hlist iterators in linux/list.h - Fix up the declaration of other iterators based on the hlist ones. - A very small amount of places were using the 'node' parameter, this was modified to use 'obj->member' instead. - Coccinelle didn't handle the hlist_for_each_entry_safe iterator properly, so those had to be fixed up manually. The semantic patch which is mostly the work of Peter Senna Tschudin is here: @@ iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host; type T; expression a,c,d,e; identifier b; statement S; @@ -T b; <+... when != b ( hlist_for_each_entry(a, - b, c, d) S | hlist_for_each_entry_continue(a, - b, c) S | hlist_for_each_entry_from(a, - b, c) S | hlist_for_each_entry_rcu(a, - b, c, d) S | hlist_for_each_entry_rcu_bh(a, - b, c, d) S | hlist_for_each_entry_continue_rcu_bh(a, - b, c) S | for_each_busy_worker(a, c, - b, d) S | ax25_uid_for_each(a, - b, c) S | ax25_for_each(a, - b, c) S | inet_bind_bucket_for_each(a, - b, c) S | sctp_for_each_hentry(a, - b, c) S | sk_for_each(a, - b, c) S | sk_for_each_rcu(a, - b, c) S | sk_for_each_from -(a, b) +(a) S + sk_for_each_from(a) S | sk_for_each_safe(a, - b, c, d) S | sk_for_each_bound(a, - b, c) S | hlist_for_each_entry_safe(a, - b, c, d, e) S | hlist_for_each_entry_continue_rcu(a, - b, c) S | nr_neigh_for_each(a, - b, c) S | nr_neigh_for_each_safe(a, - b, c, d) S | nr_node_for_each(a, - b, c) S | nr_node_for_each_safe(a, - b, c, d) S | - for_each_gfn_sp(a, c, d, b) S + for_each_gfn_sp(a, c, d) S | - for_each_gfn_indirect_valid_sp(a, c, d, b) S + for_each_gfn_indirect_valid_sp(a, c, d) S | for_each_host(a, - b, c) S | for_each_host_safe(a, - b, c, d) S | for_each_mesh_entry(a, - b, c, d) S ) ...+> [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c] [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c] [akpm@linux-foundation.org: checkpatch fixes] [akpm@linux-foundation.org: fix warnings] [akpm@linux-foudnation.org: redo intrusive kvm changes] Tested-by: NPeter Senna Tschudin <peter.senna@gmail.com> Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: NSasha Levin <sasha.levin@oracle.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Stephen Rothwell 提交于
Change it to CONFIG_HAVE_VIRT_TO_BUS and set it in all architecures that already provide virt_to_bus(). Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: NJames Hogan <james.hogan@imgtec.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: H Hartley Sweeten <hartleys@visionengravers.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 H. Peter Anvin 提交于
On non-BIOS platforms it is possible that the BIOS data area contains garbage instead of being zeroed or something equivalent (firmware people: we are talking of 1.5K here, so please do the sane thing.) We need on the order of 20-30K of low memory in order to boot, which may grow up to < 64K in the future. We probably want to avoid the lowest of the low memory. At the same time, it seems extremely unlikely that a legitimate EBDA would ever reach down to the 128K (which would require it to be over half a megabyte in size.) Thus, pick 128K as the cutoff for "this is insane, ignore." We may still end up reserving a bunch of extra memory on the low megabyte, but that is not really a major issue these days. In the worst case we lose 512K of RAM. This code really should be merged with trim_bios_range() in arch/x86/kernel/setup.c, but that is a bigger patch for a later merge window. Reported-by: NDarren Hart <dvhart@linux.intel.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com> Cc: Matt Fleming <matt.fleming@intel.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/n/tip-oebml055yyfm8yxmria09rja@git.kernel.org
-
- 26 2月, 2013 1 次提交
-
-
由 Matt Fleming 提交于
disable_runtime is only referenced from __init functions, so mark it as __initdata. Reported-by: NYinghai Lu <yinghai@kernel.org> Reviewed-by: NSatoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Signed-off-by: NMatt Fleming <matt.fleming@intel.com> Link: http://lkml.kernel.org/r/1361545427-26393-1-git-send-email-matt@console-pimps.orgSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-