- 19 3月, 2014 8 次提交
-
-
由 Stefani Seibold 提交于
This patch add the time support for 32 bit a VDSO to a 32 bit kernel. For 32 bit programs running on a 32 bit kernel, the same mechanism is used as for 64 bit programs running on a 64 bit kernel. Reviewed-by: NAndy Lutomirski <luto@amacapital.net> Signed-off-by: NStefani Seibold <stefani@seibold.net> Link: http://lkml.kernel.org/r/1395094933-14252-10-git-send-email-stefani@seibold.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Andy Lutomirski 提交于
We need the alternatives mechanism for rdtsc_barrier() to work. Signed-off-by: NStefani Seibold <stefani@seibold.net> Link: http://lkml.kernel.org/r/1395094933-14252-9-git-send-email-stefani@seibold.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Stefani Seibold 提交于
This patch revamps the vvar.h for introduce the VVAR macro for vdso32. Reviewed-by: NAndy Lutomirski <luto@amacapital.net> Signed-off-by: NStefani Seibold <stefani@seibold.net> Link: http://lkml.kernel.org/r/1395094933-14252-8-git-send-email-stefani@seibold.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Stefani Seibold 提交于
This patch cleans up the __vdso_gettimeofday() function a little. It kicks out an unneeded ret local variable and makes the code faster if only the timezone is needed (an admittedly rare case.) Reviewed-by: NAndy Lutomirski <luto@amacapital.net> Signed-off-by: NStefani Seibold <stefani@seibold.net> Link: http://lkml.kernel.org/r/1395094933-14252-7-git-send-email-stefani@seibold.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Stefani Seibold 提交于
There a currently more than 30 users of the gtod macro, so replace the last VVAR(vsyscall_gtod_data) by gtod macro. Reviewed-by: NAndy Lutomirski <luto@amacapital.net> Signed-off-by: NStefani Seibold <stefani@seibold.net> Link: http://lkml.kernel.org/r/1395094933-14252-6-git-send-email-stefani@seibold.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Stefani Seibold 提交于
This patch is a small code cleanup for the __vdso_clock_gettime() function. It removes the unneeded return values from do_monotonic_coarse() and do_realtime_coarse() and add a fallback label for doing the kernel gettimeofday() system call. Reviewed-by: NAndy Lutomirski <luto@amacapital.net> Signed-off-by: NStefani Seibold <stefani@seibold.net> Link: http://lkml.kernel.org/r/1395094933-14252-5-git-send-email-stefani@seibold.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Stefani Seibold 提交于
This intermediate patch revamps the vclock_gettime.c by moving some functions around. It is only for spliting purpose, to make whole the 32 bit vdso timer patch easier to review. Reviewed-by: NAndy Lutomirski <luto@amacapital.net> Signed-off-by: NStefani Seibold <stefani@seibold.net> Link: http://lkml.kernel.org/r/1395094933-14252-4-git-send-email-stefani@seibold.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Stefani Seibold 提交于
This patch move the vsyscall_gtod_data handling out of vsyscall_64.c into an additonal file vsyscall_gtod.c to make the functionality available for x86 32 bit kernel. It also adds a new vsyscall_32.c which setup the VVAR page. Reviewed-by: NAndy Lutomirski <luto@amacapital.net> Signed-off-by: NStefani Seibold <stefani@seibold.net> Link: http://lkml.kernel.org/r/1395094933-14252-2-git-send-email-stefani@seibold.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 14 3月, 2014 3 次提交
-
-
由 H. Peter Anvin 提交于
Checkin b0b49f26 x86, vdso: Remove compat vdso support ... removed the VDSO from the fixmap, and thus FIX_VDSO; remove a stray reference in Xen. Found by Fengguang Wu's test robot. Reported-by: NFengguang Wu <fengguang.wu@intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Link: http://lkml.kernel.org/r/4bb4690899106eb11430b1186d5cc66ca9d1660c.1394751608.git.luto@amacapital.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Andy Lutomirski 提交于
The only reason that the user bit was set was to support userspace access to the compat vDSO in the fixmap. The compat vDSO is gone, so the user bit can be removed. Signed-off-by: NAndy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/e240a977f3c7cbd525a091fd6521499ec4b8e94f.1394751608.git.luto@amacapital.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Andy Lutomirski 提交于
The compat vDSO is a complicated hack that's needed to maintain compatibility with a small range of glibc versions. This removes it and replaces it with a much simpler hack: a config option to disable the 32-bit vDSO by default. This also changes the default value of CONFIG_COMPAT_VDSO to n -- users configuring kernels from scratch almost certainly want that choice. Signed-off-by: NAndy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/4bb4690899106eb11430b1186d5cc66ca9d1660c.1394751608.git.luto@amacapital.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 08 3月, 2014 2 次提交
-
-
由 Linus Torvalds 提交于
It's an enum, not a #define, you can't use it in asm files. Introduced in commit 5fa10196 ("x86: Ignore NMIs that come in during early boot"), and sadly I didn't compile-test things like I should have before pushing out. My weak excuse is that the x86 tree generally doesn't introduce stupid things like this (and the ARM pull afterwards doesn't cause me to do a compile-test either, since I don't cross-compile). Cc: Don Zickus <dzickus@redhat.com> Cc: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 H. Peter Anvin 提交于
Don Zickus reports: A customer generated an external NMI using their iLO to test kdump worked. Unfortunately, the machine hung. Disabling the nmi_watchdog made things work. I speculated the external NMI fired, caused the machine to panic (as expected) and the perf NMI from the watchdog came in and was latched. My guess was this somehow caused the hang. ---- It appears that the latched NMI stays latched until the early page table generation on 64 bits, which causes exceptions to happen which end in IRET, which re-enable NMI. Therefore, ignore NMIs that come in during early execution, until we have proper exception handling. Reported-and-tested-by: NDon Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/r/1394221143-29713-1-git-send-email-dzickus@redhat.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org> # v3.5+, older with some backport effort
-
- 07 3月, 2014 1 次提交
-
-
由 Peter Zijlstra 提交于
Building on commit 0ac09f9f ("x86, trace: Fix CR2 corruption when tracing page faults") this patch addresses another few issues: - Now that read_cr2() is lifted into trace_do_page_fault(), we should pass the address to trace_page_fault_entries() to avoid it re-reading a potentially changed cr2. - Put both trace_do_page_fault() and trace_page_fault_entries() under CONFIG_TRACING. - Mark both fault entry functions {,trace_}do_page_fault() as notrace to avoid getting __mcount or other function entry trace callbacks before we've observed CR2. - Mark __do_page_fault() as noinline to guarantee the function tracer does get to see the fault. Cc: <jolsa@redhat.com> Cc: <vincent.weaver@maine.edu> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140306145300.GO9987@twins.programming.kicks-ass.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 05 3月, 2014 2 次提交
-
-
由 Jiri Olsa 提交于
The trace_do_page_fault function trigger tracepoint and then handles the actual page fault. This could lead to error if the tracepoint caused page fault. The original cr2 value gets lost and the original page fault handler kills current process with SIGSEGV. This happens if you record page faults with callchain data, the user part of it will cause tracepoint handler to page fault: # perf record -g -e exceptions:page_fault_user ls Fixing this by saving the original cr2 value and using it after tracepoint handler is done. v2: Moving the cr2 read before exception_enter, because it could trigger tracepoint as well. Reported-by: NArnaldo Carvalho de Melo <acme@ghostprotocols.net> Reported-by: NVince Weaver <vincent.weaver@maine.edu> Tested-by: NVince Weaver <vincent.weaver@maine.edu> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1402211701380.6395@vincent-weaver-1.um.maine.edu Link: http://lkml.kernel.org/r/20140228160526.GD1133@krava.brq.redhat.com
-
由 Borislav Petkov 提交于
Alex reported hitting the following BUG after the EFI 1:1 virtual mapping work was merged, kernel BUG at arch/x86/mm/init_64.c:351! invalid opcode: 0000 [#1] SMP Call Trace: [<ffffffff818aa71d>] init_extra_mapping_uc+0x13/0x15 [<ffffffff818a5e20>] uv_system_init+0x22b/0x124b [<ffffffff8108b886>] ? clockevents_register_device+0x138/0x13d [<ffffffff81028dbb>] ? setup_APIC_timer+0xc5/0xc7 [<ffffffff8108b620>] ? clockevent_delta2ns+0xb/0xd [<ffffffff818a3a92>] ? setup_boot_APIC_clock+0x4a8/0x4b7 [<ffffffff8153d955>] ? printk+0x72/0x74 [<ffffffff818a1757>] native_smp_prepare_cpus+0x389/0x3d6 [<ffffffff818957bc>] kernel_init_freeable+0xb7/0x1fb [<ffffffff81535530>] ? rest_init+0x74/0x74 [<ffffffff81535539>] kernel_init+0x9/0xff [<ffffffff81541dfc>] ret_from_fork+0x7c/0xb0 [<ffffffff81535530>] ? rest_init+0x74/0x74 Getting this thing to work with the new mapping scheme would need more work, so automatically switch to the old memmap layout for SGI UV. Acked-by: NRuss Anderson <rja@sgi.com> Cc: Alex Thorlton <athorlton@sgi.com Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
-
- 28 2月, 2014 2 次提交
-
-
由 Paolo Bonzini 提交于
Commit e504c909 (kvm, vmx: Fix lazy FPU on nested guest, 2013-11-13) highlighted a real problem, but the fix was subtly wrong. nested_read_cr0 is the CR0 as read by L2, but here we want to look at the CR0 value reflecting L1's setup. In other words, L2 might think that TS=0 (so nested_read_cr0 has the bit clear); but if L1 is actually running it with TS=1, we should inject the fault into L1. The effective value of CR0 in L2 is contained in vmcs12->guest_cr0, use it. Fixes: e504c909Reported-by: NKashyap Chamarty <kchamart@redhat.com> Reported-by: NStefan Bader <stefan.bader@canonical.com> Tested-by: NKashyap Chamarty <kchamart@redhat.com> Tested-by: NAnthoine Bourgeois <bourgeois@bertin.fr> Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Andrew Honig 提交于
The problem occurs when the guest performs a pusha with the stack address pointing to an mmio address (or an invalid guest physical address) to start with, but then extending into an ordinary guest physical address. When doing repeated emulated pushes emulator_read_write sets mmio_needed to 1 on the first one. On a later push when the stack points to regular memory, mmio_nr_fragments is set to 0, but mmio_is_needed is not set to 0. As a result, KVM exits to userspace, and then returns to complete_emulated_mmio. In complete_emulated_mmio vcpu->mmio_cur_fragment is incremented. The termination condition of vcpu->mmio_cur_fragment == vcpu->mmio_nr_fragments is never achieved. The code bounces back and fourth to userspace incrementing mmio_cur_fragment past it's buffer. If the guest does nothing else it eventually leads to a a crash on a memcpy from invalid memory address. However if a guest code can cause the vm to be destroyed in another vcpu with excellent timing, then kvm_clear_async_pf_completion_queue can be used by the guest to control the data that's pointed to by the call to cancel_work_item, which can be used to gain execution. Fixes: f78146b0Signed-off-by: NAndrew Honig <ahonig@google.com> Cc: stable@vger.kernel.org (3.5+) Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 27 2月, 2014 2 次提交
-
-
由 Peter Zijlstra 提交于
Vince "Super Tester" Weaver reported a new round of syscall fuzzing (Trinity) failures, with perf WARN_ON()s triggering. He also provided traces of the failures. This is I think the relevant bit: > pec_1076_warn-2804 [000] d... 147.926153: x86_pmu_disable: x86_pmu_disable > pec_1076_warn-2804 [000] d... 147.926153: x86_pmu_state: Events: { > pec_1076_warn-2804 [000] d... 147.926156: x86_pmu_state: 0: state: .R config: ffffffffffffffff ( (null)) > pec_1076_warn-2804 [000] d... 147.926158: x86_pmu_state: 33: state: AR config: 0 (ffff88011ac99800) > pec_1076_warn-2804 [000] d... 147.926159: x86_pmu_state: } > pec_1076_warn-2804 [000] d... 147.926160: x86_pmu_state: n_events: 1, n_added: 0, n_txn: 1 > pec_1076_warn-2804 [000] d... 147.926161: x86_pmu_state: Assignment: { > pec_1076_warn-2804 [000] d... 147.926162: x86_pmu_state: 0->33 tag: 1 config: 0 (ffff88011ac99800) > pec_1076_warn-2804 [000] d... 147.926163: x86_pmu_state: } > pec_1076_warn-2804 [000] d... 147.926166: collect_events: Adding event: 1 (ffff880119ec8800) So we add the insn:p event (fd[23]). At this point we should have: n_events = 2, n_added = 1, n_txn = 1 > pec_1076_warn-2804 [000] d... 147.926170: collect_events: Adding event: 0 (ffff8800c9e01800) > pec_1076_warn-2804 [000] d... 147.926172: collect_events: Adding event: 4 (ffff8800cbab2c00) We try and add the {BP,cycles,br_insn} group (fd[3], fd[4], fd[15]). These events are 0:cycles and 4:br_insn, the BP event isn't x86_pmu so that's not visible. group_sched_in() pmu->start_txn() /* nop - BP pmu */ event_sched_in() event->pmu->add() So here we should end up with: 0: n_events = 3, n_added = 2, n_txn = 2 4: n_events = 4, n_added = 3, n_txn = 3 But seeing the below state on x86_pmu_enable(), the must have failed, because the 0 and 4 events aren't there anymore. Looking at group_sched_in(), since the BP is the leader, its event_sched_in() must have succeeded, for otherwise we would not have seen the sibling adds. But since neither 0 or 4 are in the below state; their event_sched_in() must have failed; but I don't see why, the complete state: 0,0,1:p,4 fits perfectly fine on a core2. However, since we try and schedule 4 it means the 0 event must have succeeded! Therefore the 4 event must have failed, its failure will have put group_sched_in() into the fail path, which will call: event_sched_out() event->pmu->del() on 0 and the BP event. Now x86_pmu_del() will reduce n_events; but it will not reduce n_added; giving what we see below: n_event = 2, n_added = 2, n_txn = 2 > pec_1076_warn-2804 [000] d... 147.926177: x86_pmu_enable: x86_pmu_enable > pec_1076_warn-2804 [000] d... 147.926177: x86_pmu_state: Events: { > pec_1076_warn-2804 [000] d... 147.926179: x86_pmu_state: 0: state: .R config: ffffffffffffffff ( (null)) > pec_1076_warn-2804 [000] d... 147.926181: x86_pmu_state: 33: state: AR config: 0 (ffff88011ac99800) > pec_1076_warn-2804 [000] d... 147.926182: x86_pmu_state: } > pec_1076_warn-2804 [000] d... 147.926184: x86_pmu_state: n_events: 2, n_added: 2, n_txn: 2 > pec_1076_warn-2804 [000] d... 147.926184: x86_pmu_state: Assignment: { > pec_1076_warn-2804 [000] d... 147.926186: x86_pmu_state: 0->33 tag: 1 config: 0 (ffff88011ac99800) > pec_1076_warn-2804 [000] d... 147.926188: x86_pmu_state: 1->0 tag: 1 config: 1 (ffff880119ec8800) > pec_1076_warn-2804 [000] d... 147.926188: x86_pmu_state: } > pec_1076_warn-2804 [000] d... 147.926190: x86_pmu_enable: S0: hwc->idx: 33, hwc->last_cpu: 0, hwc->last_tag: 1 hwc->state: 0 So the problem is that x86_pmu_del(), when called from a group_sched_in() that fails (for whatever reason), and without x86_pmu TXN support (because the leader is !x86_pmu), will corrupt the n_added state. Reported-and-Tested-by: NVince Weaver <vincent.weaver@maine.edu> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Stephane Eranian <eranian@google.com> Cc: Dave Jones <davej@redhat.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20140221150312.GF3104@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Marcelo Tosatti 提交于
Read-only large sptes can be created due to read-only faults as follows: - QEMU pagetable entry that maps guest memory is read-only due to COW. - Guest read faults such memory, COW is not broken, because it is a read-only fault. - Enable dirty logging, large spte not nuked because it is read-only. - Write-fault on such memory causes guest to loop endlessly (which must go down to level 1 because dirty logging is enabled). Fix by dropping large spte when necessary. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 26 2月, 2014 2 次提交
-
-
由 Kees Cook 提交于
This silences build warnings about unexported variables and functions. Signed-off-by: NKees Cook <keescook@chromium.org> Link: http://lkml.kernel.org/r/20140209215644.GA30339@www.outflux.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Eugene Surovegin 提交于
Include kASLR offset in VMCOREINFO ELF notes to assist in debugging. [ hpa: pushing this for v3.14 to avoid having a kernel version with kASLR where we can't debug output. ] Signed-off-by: NEugene Surovegin <surovegin@google.com> Link: http://lkml.kernel.org/r/20140123173120.GA25474@www.outflux.netSigned-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 22 2月, 2014 3 次提交
-
-
由 Stephane Eranian 提交于
This patch updates the CBOX PMU filters mapping tables for SNB-EP and IVT (model 45 and 62 respectively). The NID umask always comes in addition to another umask. When set, the NID filter is applied. The current mapping tables were missing some code/umask combinations to account for the NID umask. This patch fixes that. Cc: mingo@elte.hu Cc: ak@linux.intel.com Reviewed-by: NYan, Zheng <zheng.z.yan@intel.com> Signed-off-by: NStephane Eranian <eranian@google.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140219131018.GA24475@quadSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Peter Zijlstra 提交于
The current code simply assumes Intel Arch PerfMon v2+ to have the IA32_PERF_CAPABILITIES MSR; the SDM specifies that we should check CPUID[1].ECX[15] (aka, FEATURE_PDCM) instead. This was found by KVM which implements v2+ but didn't provide the capabilities MSR. Change the code to DTRT; KVM will also implement the MSR and return 0. Cc: pbonzini@redhat.com Reported-by: N"Michael S. Tsirkin" <mst@redhat.com> Suggested-by: NEduardo Habkost <ehabkost@redhat.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140203132903.GI8874@twins.programming.kicks-ass.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Markus Metzger 提交于
When using BTS on Core i7-4*, I get the below kernel warning. $ perf record -c 1 -e branches:u ls Message from syslogd@labpc1501 at Nov 11 15:49:25 ... kernel:[ 438.317893] Uhhuh. NMI received for unknown reason 31 on CPU 2. Message from syslogd@labpc1501 at Nov 11 15:49:25 ... kernel:[ 438.317920] Do you have a strange power saving mode enabled? Message from syslogd@labpc1501 at Nov 11 15:49:25 ... kernel:[ 438.317945] Dazed and confused, but trying to continue Make intel_pmu_handle_irq() take the full exit path when returning early. Cc: eranian@google.com Cc: peterz@infradead.org Cc: mingo@kernel.org Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com> Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1392425048-5309-1-git-send-email-andi@firstfloor.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 20 2月, 2014 2 次提交
-
-
由 Mika Westerberg 提交于
Intel Baytrail is based on Silvermont core so MSR_FSB_FREQ[2:0] == 0 means that the CPU reference clock runs at 83.3MHz. Add this missing frequency to the table. Signed-off-by: NMika Westerberg <mika.westerberg@linux.intel.com> Cc: Bin Gao <bin.gao@linux.intel.com> Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk> Cc: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/1392810750-18660-2-git-send-email-mika.westerberg@linux.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Thomas Gleixner 提交于
If we cannot calibrate TSC via MSR based calibration try_msr_calibrate_tsc() stores zero to fast_calibrate and returns that to the caller. This value gets then propagated further to clockevents code resulting division by zero oops like the one below: divide error: 0000 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 3.13.0+ #47 task: ffff880075508000 ti: ffff880075506000 task.ti: ffff880075506000 RIP: 0010:[<ffffffff810aec14>] [<ffffffff810aec14>] clockevents_config.part.3+0x24/0xa0 RSP: 0000:ffff880075507e58 EFLAGS: 00010246 RAX: ffffffffffffffff RBX: ffff880079c0cd80 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffffffffff RBP: ffff880075507e70 R08: 0000000000000001 R09: 00000000000000be R10: 00000000000000bd R11: 0000000000000003 R12: 000000000000b008 R13: 0000000000000008 R14: 000000000000b010 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff880079c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffff880079fff000 CR3: 0000000001c0b000 CR4: 00000000001006f0 Stack: ffff880079c0cd80 000000000000b008 0000000000000008 ffff880075507e88 ffffffff810aecb0 ffff880079c0cd80 ffff880075507e98 ffffffff81030168 ffff880075507ed8 ffffffff81d1104f 00000000000000c3 0000000000000000 Call Trace: [<ffffffff810aecb0>] clockevents_config_and_register+0x20/0x30 [<ffffffff81030168>] setup_APIC_timer+0xc8/0xd0 [<ffffffff81d1104f>] setup_boot_APIC_clock+0x4cc/0x4d8 [<ffffffff81d0f5de>] native_smp_prepare_cpus+0x3dd/0x3f0 [<ffffffff81d02ee9>] kernel_init_freeable+0xc3/0x205 [<ffffffff8177c910>] ? rest_init+0x90/0x90 [<ffffffff8177c91e>] kernel_init+0xe/0x120 [<ffffffff8178deec>] ret_from_fork+0x7c/0xb0 [<ffffffff8177c910>] ? rest_init+0x90/0x90 Prevent this from happening by: 1) Modifying try_msr_calibrate_tsc() to return calibration value or zero if it fails. 2) Check this return value in native_calibrate_tsc() and in case of zero fallback to use normal non-MSR based calibration. [mw: Added subject and changelog] Reported-and-tested-by: NMika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Bin Gao <bin.gao@linux.intel.com> Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk> Cc: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/1392810750-18660-1-git-send-email-mika.westerberg@linux.intel.comSigned-off-by: NMika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 14 2月, 2014 3 次提交
-
-
由 Matt Fleming 提交于
Madper reported seeing the following crash, BUG: unable to handle kernel paging request at ffffffffff340003 IP: [<ffffffff81d85ba4>] efi_bgrt_init+0x9d/0x133 Call Trace: [<ffffffff81d8525d>] efi_late_init+0x9/0xb [<ffffffff81d68f59>] start_kernel+0x436/0x450 [<ffffffff81d6892c>] ? repair_env_string+0x5c/0x5c [<ffffffff81d68120>] ? early_idt_handlers+0x120/0x120 [<ffffffff81d685de>] x86_64_start_reservations+0x2a/0x2c [<ffffffff81d6871e>] x86_64_start_kernel+0x13e/0x14d This is caused because the layout of the ACPI BGRT header on this system doesn't match the definition from the ACPI spec, and so we get a bogus physical address when dereferencing ->image_address in efi_bgrt_init(). Luckily the status field in the BGRT header clearly marks it as invalid, so we can check that field and skip BGRT initialisation. Reported-by: NMadper Xie <cxie@redhat.com> Suggested-by: NToshi Kani <toshi.kani@hp.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Josh Triplett <josh@joshtriplett.org> Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
-
由 Borislav Petkov 提交于
We do not enable the new efi memmap on 32-bit and thus we need to run runtime_code_page_mkexec() unconditionally there. Fix that. Reported-and-tested-by: NLejun Zhu <lejun.zhu@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
-
由 H. Peter Anvin 提交于
If CONFIG_X86_SMAP is disabled, smap_violation() tests for conditions which are incorrect (as the AC flag doesn't matter), causing spurious faults. The dynamic disabling of SMAP (nosmap on the command line) is fine because it disables X86_FEATURE_SMAP, therefore causing the static_cpu_has() to return false. Found by Fengguang Wu's test system. [ v3: move all predicates into smap_violation() ] [ v2: use IS_ENABLED() instead of #ifdef ] Reported-by: NFengguang Wu <fengguang.wu@intel.com> Link: http://lkml.kernel.org/r/20140213124550.GA30497@localhostSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org> # v3.7+
-
- 13 2月, 2014 1 次提交
-
-
由 H. Peter Anvin 提交于
If SMAP support is not compiled into the kernel, don't enable SMAP in CR4 -- in fact, we should clear it, because the kernel doesn't contain the proper STAC/CLAC instructions for SMAP support. Found by Fengguang Wu's test system. Reported-by: NFengguang Wu <fengguang.wu@intel.com> Link: http://lkml.kernel.org/r/20140213124550.GA30497@localhostSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org> # v3.7+
-
- 12 2月, 2014 1 次提交
-
-
由 Steven Rostedt (Red Hat) 提交于
When the conversion was made to remove stop machine and use the breakpoint logic instead, the modification of the function graph caller is still done directly as though it was being done under stop machine. As it is not converted via stop machine anymore, there is a possibility that the code could be layed across cache lines and if another CPU is accessing that function graph call when it is being updated, it could cause a General Protection Fault. Convert the update of the function graph caller to use the breakpoint method as well. Cc: H. Peter Anvin <hpa@zytor.com> Cc: stable@vger.kernel.org # 3.5+ Fixes: 08d636b6 "ftrace/x86: Have arch x86_64 use breakpoints instead of stop machine" Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 11 2月, 2014 2 次提交
-
-
由 Marek Szyprowski 提交于
GFP_ATOMIC is not a single gfp flag, but a macro which expands to the other flags, where meaningful is the LACK of __GFP_WAIT flag. To check if caller wants to perform an atomic allocation, the code must test for a lack of the __GFP_WAIT flag. This patch fixes the issue introduced in v3.5-rc1. CC: stable@vger.kernel.org Signed-off-by: NMarek Szyprowski <m.szyprowski@samsung.com>
-
由 Mel Gorman 提交于
Steven Noonan forwarded a users report where they had a problem starting vsftpd on a Xen paravirtualized guest, with this in dmesg: BUG: Bad page map in process vsftpd pte:8000000493b88165 pmd:e9cc01067 page:ffffea00124ee200 count:0 mapcount:-1 mapping: (null) index:0x0 page flags: 0x2ffc0000000014(referenced|dirty) addr:00007f97eea74000 vm_flags:00100071 anon_vma:ffff880e98f80380 mapping: (null) index:7f97eea74 CPU: 4 PID: 587 Comm: vsftpd Not tainted 3.12.7-1-ec2 #1 Call Trace: dump_stack+0x45/0x56 print_bad_pte+0x22e/0x250 unmap_single_vma+0x583/0x890 unmap_vmas+0x65/0x90 exit_mmap+0xc5/0x170 mmput+0x65/0x100 do_exit+0x393/0x9e0 do_group_exit+0xcc/0x140 SyS_exit_group+0x14/0x20 system_call_fastpath+0x1a/0x1f Disabling lock debugging due to kernel taint BUG: Bad rss-counter state mm:ffff880e9ca60580 idx:0 val:-1 BUG: Bad rss-counter state mm:ffff880e9ca60580 idx:1 val:1 The issue could not be reproduced under an HVM instance with the same kernel, so it appears to be exclusive to paravirtual Xen guests. He bisected the problem to commit 1667918b ("mm: numa: clear numa hinting information on mprotect") that was also included in 3.12-stable. The problem was related to how xen translates ptes because it was not accounting for the _PAGE_NUMA bit. This patch splits pte_present to add a pteval_present helper for use by xen so both bare metal and xen use the same code when checking if a PTE is present. [mgorman@suse.de: wrote changelog, proposed minor modifications] [akpm@linux-foundation.org: fix typo in comment] Reported-by: NSteven Noonan <steven@uplinklabs.net> Tested-by: NSteven Noonan <steven@uplinklabs.net> Signed-off-by: NElena Ufimtseva <ufimtseva@gmail.com> Signed-off-by: NMel Gorman <mgorman@suse.de> Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com> Acked-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: <stable@vger.kernel.org> [3.12+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 2月, 2014 3 次提交
-
-
由 Steven Rostedt 提交于
When debug preempt is enabled, preempt_disable() can be traced by function and function graph tracing. There's a place in the function graph tracer that calls trace_clock() which eventually calls cycles_2_ns() outside of the recursion protection. When cycles_2_ns() calls preempt_disable() it gets traced and the graph tracer will go into a recursive loop causing a crash or worse, a triple fault. Simple fix is to use preempt_disable_notrace() in cycles_2_ns, which makes sense because the preempt_disable() tracing may use that code too, and it tracing it, even with recursion protection is rather pointless. Signed-off-by: NSteven Rostedt <rostedt@goodmis.org> Acked-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140204141315.2a968a72@gandalf.local.homeSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
The current code forgets to change the CR4 state on the current CPU. Use on_each_cpu() instead of smp_call_function(). Reported-by: NMark Davies <junk@eslaf.co.uk> Suggested-by: NMark Davies <junk@eslaf.co.uk> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Cc: fweisbec@gmail.com Link: http://lkml.kernel.org/n/tip-69efsat90ibhnd577zy3z9gh@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
PPro machines can die hard when PCE gets enabled due to a CPU erratum. The safe way it so disable it by default and keep it disabled. See erratum 26 in: http://download.intel.com/design/archives/processors/pro/docs/24268935.pdfReported-and-Tested-by: NMark Davies <junk@eslaf.co.uk> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vince@deater.net> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140206170815.GW2936@laptop.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 07 2月, 2014 3 次提交
-
-
由 Tang Chen 提交于
The following path will cause array out of bound. memblock_add_region() will always set nid in memblock.reserved to MAX_NUMNODES. In numa_register_memblks(), after we set all nid to correct valus in memblock.reserved, we called setup_node_data(), and used memblock_alloc_nid() to allocate memory, with nid set to MAX_NUMNODES. The nodemask_t type can be seen as a bit array. And the index is 0 ~ MAX_NUMNODES-1. After that, when we call node_set() in numa_clear_kernel_node_hotplug(), the nodemask_t got an index of value MAX_NUMNODES, which is out of [0 ~ MAX_NUMNODES-1]. See below: numa_init() |---> numa_register_memblks() | |---> memblock_set_node(memory) set correct nid in memblock.memory | |---> memblock_set_node(reserved) set correct nid in memblock.reserved | |...... | |---> setup_node_data() | |---> memblock_alloc_nid() here, nid is set to MAX_NUMNODES (1024) |...... |---> numa_clear_kernel_node_hotplug() |---> node_set() here, we have an index 1024, and overflowed This patch moves nid setting to numa_clear_kernel_node_hotplug() to fix this problem. Reported-by: NDave Jones <davej@redhat.com> Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com> Tested-by: NGu Zheng <guz.fnst@cn.fujitsu.com> Reported-by: NDave Jones <davej@redhat.com> Cc: David Rientjes <rientjes@google.com> Tested-by: NDave Jones <davej@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tang Chen 提交于
On-stack variable numa_kernel_nodes in numa_clear_kernel_node_hotplug() was not initialized. So we need to initialize it. [akpm@linux-foundation.org: use NODE_MASK_NONE, per David] Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com> Tested-by: NGu Zheng <guz.fnst@cn.fujitsu.com> Reported-by: NDave Jones <davej@redhat.com> Reported-by: NDavid Rientjes <rientjes@google.com> Tested-by: NDave Jones <davej@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Borislav Petkov 提交于
For additional coverage, BorisO and friends unknowlingly did swap AMD microcode with Intel microcode blobs in order to see what happens. What did happen on 32-bit was [ 5.722656] BUG: unable to handle kernel paging request at be3a6008 [ 5.722693] IP: [<c106d6b4>] load_microcode_amd+0x24/0x3f0 [ 5.722716] *pdpt = 0000000000000000 *pde = 0000000000000000 because there was a valid initrd there but without valid microcode in it and the container check happened *after* the relocated ramdisk handling on 32-bit, which was clearly wrong. While at it, take care of the ramdisk relocation on both 32- and 64-bit as it is done on both. Also, comment what we're doing because this code is a bit tricky. Reported-and-tested-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1391460104-7261-1-git-send-email-bp@alien8.deSigned-off-by: NH. Peter Anvin <hpa@zytor.com>
-