- 21 11月, 2016 1 次提交
-
-
由 Josh Poimboeuf 提交于
NMI stack dumps are bracketed by the following tags: <NMI> ... <EOE> The ending tag is kind of confusing if you don't already know what "EOE" means (end of exception). The same ending tag is also used to mark the end of all other exceptions' stacks. For example: <#DF> ... <EOE> And similarly, "EOI" is used as the ending tag for interrupts: <IRQ> ... <EOI> Change the tags to be more comprehensible by making them symmetrical and more XML-esque: <NMI> ... </NMI> <#DF> ... </#DF> <IRQ> ... </IRQ> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Acked-by: NFrederic Weisbecker <fweisbec@gmail.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/180196e3754572540b595bc56b947d43658979a7.1479491159.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 17 11月, 2016 2 次提交
-
-
由 Josh Poimboeuf 提交于
When show_trace_log_lvl() is called from show_regs(), it completely fails to dump the stack. This bug was introduced when show_stack_log_lvl() was removed with the following commit: 0ee1dd9f ("x86/dumpstack: Remove raw stack dump") Previous callers of that function now call show_trace_log_lvl() directly. That resulted in a subtle change, in that the 'stack' argument can now be NULL in certain cases. A NULL 'stack' pointer means that the stack dump should start from the topmost stack frame unless 'regs' is valid, in which case it should start from 'regs->sp'. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 0ee1dd9f ("x86/dumpstack: Remove raw stack dump") Link: http://lkml.kernel.org/r/c551842302a9c222d96a14e42e4003f059509f69.1479362652.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Gayatri Kammela 提交于
Add a few new AVX512 instruction groups/features for enumeration in /proc/cpuinfo: AVX512IFMA and AVX512VBMI. Clear the flags in fpu_xstate_clear_all_cpu_caps(). CPUID.(EAX=7,ECX=0):EBX[bit 21] AVX512IFMA CPUID.(EAX=7,ECX=0):ECX[bit 1] AVX512VBMI Detailed information of cpuid bits for the features can be found at https://bugzilla.kernel.org/show_bug.cgi?id=187891Signed-off-by: NGayatri Kammela <gayatri.kammela@intel.com> Reviewed-by: NBorislav Petkov <bp@alien8.de> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: mingo@elte.hu Link: http://lkml.kernel.org/r/1479327060-18668-1-git-send-email-gayatri.kammela@intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 16 11月, 2016 2 次提交
-
-
由 He Chen 提交于
Sparse populated CPUID leafs are collected in a software provided leaf to avoid bloat of the x86_capability array, but there is no way to rebuild the real leafs (e.g. for KVM CPUID enumeration) other than rereading the CPUID leaf from the CPU. While this is possible it is problematic as it does not take software disabled features into account. If a feature is disabled on the host it should not be exposed to a guest either. Add get_scattered_cpuid_leaf() which rebuilds the leaf from the scattered cpuid table information and the active CPU features. [ tglx: Rewrote changelog ] Signed-off-by: NHe Chen <he.chen@linux.intel.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Cc: Luwei Kang <luwei.kang@intel.com> Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Piotr Luc <Piotr.Luc@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lkml.kernel.org/r/1478856336-9388-3-git-send-email-he.chen@linux.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 He Chen 提交于
cpuid_regs is defined multiple times as structure and enum. Rename the enum and move all of it to processor.h so we don't end up with more instances. Rename the misnomed register enumeration from CR_* to the obvious CPUID_*. [ tglx: Rewrote changelog ] Signed-off-by: NHe Chen <he.chen@linux.intel.com> Reviewed-by: NBorislav Petkov <bp@alien8.de> Cc: Luwei Kang <luwei.kang@intel.com> Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Piotr Luc <Piotr.Luc@intel.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lkml.kernel.org/r/1478856336-9388-2-git-send-email-he.chen@linux.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 12 11月, 2016 1 次提交
-
-
由 Arnd Bergmann 提交于
apm_bios_call() can fail, and return a status in its argument structure. If that status however is zero during a call from apm_get_power_status(), we end up using data that may have never been set, as reported by "gcc -Wmaybe-uninitialized": arch/x86/kernel/apm_32.c: In function ‘apm’: arch/x86/kernel/apm_32.c:1729:17: error: ‘bx’ may be used uninitialized in this function [-Werror=maybe-uninitialized] arch/x86/kernel/apm_32.c:1835:5: error: ‘cx’ may be used uninitialized in this function [-Werror=maybe-uninitialized] arch/x86/kernel/apm_32.c:1730:17: note: ‘cx’ was declared here arch/x86/kernel/apm_32.c:1842:27: error: ‘dx’ may be used uninitialized in this function [-Werror=maybe-uninitialized] arch/x86/kernel/apm_32.c:1731:17: note: ‘dx’ was declared here This changes the function to return "APM_NO_ERROR" here, which makes the code more robust to broken BIOS versions, and avoids the warning. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Reviewed-by: NJiri Kosina <jkosina@suse.cz> Reviewed-by: NLuis R. Rodriguez <mcgrof@kernel.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 29 10月, 2016 1 次提交
-
-
由 Thomas Gleixner 提交于
The recent changes, which forced the registration of the boot cpu on UP systems, which do not have ACPI tables, have been fixed for systems w/o local APIC, but left a wreckage for systems which have neither ACPI nor mptables, but the CPU has an APIC, e.g. virtualbox. The boot process crashes in prefill_possible_map() as it wants to register the boot cpu, which needs to access the local apic, but the local APIC is not yet mapped. There is no reason why init_apic_mapping() can't be invoked before prefill_possible_map(). So instead of playing another silly early mapping game, as the ACPI/mptables code does, we just move init_apic_mapping() before the call to prefill_possible_map(). In hindsight, I should have noticed that combination earlier. Sorry for the churn (also in stable)! Fixes: ff856051 ("x86/boot/smp: Don't try to poke disabled/non-existent APIC") Reported-and-debugged-by: NMichal Necasek <michal.necasek@oracle.com> Reported-and-tested-by: NWolfgang Bauer <wbauer@tmo.at> Cc: prarit@redhat.com Cc: ville.syrjala@linux.intel.com Cc: michael.thayer@oracle.com Cc: knut.osmundsen@oracle.com Cc: frank.mehnert@oracle.com Cc: Borislav Petkov <bp@alien8.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610282114380.5053@nanosSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 28 10月, 2016 2 次提交
-
-
由 Borislav Petkov 提交于
We needed the physical address of the container in order to compute the offset within the relocated ramdisk. And we did this by doing __pa() on the virtual address. However, __pa() does checks whether the physical address is within PAGE_OFFSET and __START_KERNEL_map - see __phys_addr() - which fail if we have CONFIG_RANDOMIZE_MEMORY enabled: we feed a virtual address which *doesn't* have the randomization offset into a function which uses PAGE_OFFSET which *does* have that offset. This makes this check fire: VIRTUAL_BUG_ON((x > y) || !phys_addr_valid(x)); ^^^^^^ due to the randomization offset. The fix is as simple as using __pa_nodebug() because we do that randomization offset accounting later in that function ourselves. Reported-by: NBob Peterson <rpeterso@redhat.com> Tested-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm <linux-mm@kvack.org> Cc: stable@vger.kernel.org # 4.9 Link: http://lkml.kernel.org/r/20161027123623.j2jri5bandimboff@pd.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
Add a sanity check to ensure the stack only grows down, and print a warning if the check fails. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20161027131058.tpdffwlqipv7pcd6@trebleSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 27 10月, 2016 3 次提交
-
-
由 Josh Poimboeuf 提交于
If __kernel_text_address() doesn't recognize a return address on the stack, it probably means that it's some generated code which __kernel_text_address() doesn't know about yet. Otherwise there's probably some stack corruption. Either way, warn about it. Use printk_deferred_once() because the unwinder can be called with the console lock by lockdep via save_stack_trace(). Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/2d897898f324e275943b590d160b55e482bba65f.1477496147.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
Print a warning if stack recursion is detected. Use printk_deferred_once() because the unwinder can be called with the console lock by lockdep via save_stack_trace(). Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/def18247aafaab480844484398e793f552b79bda.1477496147.git.jpoimboe@redhat.com [ Unbroke the lines. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
Detect situations in the unwinder where the frame pointer refers to a bad address, and print an appropriate warning. Use printk_deferred_once() because the unwinder can be called with the console lock by lockdep via save_stack_trace(). Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/03c888f6f7414d54fa56b393ea25482be6899b5f.1477496147.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 26 10月, 2016 3 次提交
-
-
由 Steven Rostedt 提交于
Commit 784d5699 ("x86: move exports to actual definitions") removed the EXPORT_SYMBOL(__fentry__) and EXPORT_SYMBOL(mcount) from x8664_ksyms_64.c, and added EXPORT_SYMBOL(function_hook) in mcount_64.S instead. The problem is that function_hook isn't a function at all, but a macro that is defined as either mcount or __fentry__ depending on the support from gcc. Originally, I thought this was a macro issue, like what __stringify() is used for. But the problem is a bit deeper. The Makefile.build has some magic that does post processing of files to create the CRC bindings. It does some searches for EXPORT_SYMBOL() and because it finds a macro name and not the actual functions, this causes function_hook not to be converted into mcount or __fentry__ and they are missed. Instead of adding more magic to Makefile.build, just add EXPORT_SYMBOL() for mcount and __fentry__ where the ifdef is used. Since this is assembly and not C, it doesn't require being set after the function is defined. Signed-off-by: NSteven Rostedt <rostedt@goodmis.org> Tested-by: NBorislav Petkov <bp@alien8.de> Cc: Gabriel C <nix.or.die@gmail.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Link: http://lkml.kernel.org/r/20161024150148.4f9d90e4@gandalf.local.homeSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Josh Poimboeuf 提交于
For mostly historical reasons, the x86 oops dump shows the raw stack values: ... [registers] Stack: ffff880079af7350 ffff880079905400 0000000000000000 ffffc900008f3ae0 ffffffffa0196610 0000000000000001 00010000ffffffff 0000000087654321 0000000000000002 0000000000000000 0000000000000000 0000000000000000 Call Trace: ... This seems to be an artifact from long ago, and probably isn't needed anymore. It generally just adds noise to the dump, and it can be actively harmful because it leaks kernel addresses. Linus says: "The stack dump actually goes back to forever, and it used to be useful back in 1992 or so. But it used to be useful mainly because stacks were simpler and we didn't have very good call traces anyway. I definitely remember having used them - I just do not remember having used them in the last ten+ years. Of course, it's still true that if you can trigger an oops, you've likely already lost the security game, but since the stack dump is so useless, let's aim to just remove it and make games like the above harder." This also removes the related 'kstack=' cmdline option and the 'kstack_depth_to_print' sysctl. Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/e83bd50df52d8fe88e94d2566426ae40d813bf8f.1477405374.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
Printing kernel text addresses in stack dumps is of questionable value, especially now that address randomization is becoming common. It can be a security issue because it leaks kernel addresses. It also affects the usefulness of the stack dump. Linus says: "I actually spend time cleaning up commit messages in logs, because useless data that isn't actually information (random hex numbers) is actively detrimental. It makes commit logs less legible. It also makes it harder to parse dumps. It's not useful. That makes it actively bad. I probably look at more oops reports than most people. I have not found the hex numbers useful for the last five years, because they are just randomized crap. The stack content thing just makes code scroll off the screen etc, for example." The only real downside to removing these addresses is that they can be used to disambiguate duplicate symbol names. However such cases are rare, and the context of the stack dump should be enough to be able to figure it out. There's now a 'faddr2line' script which can be used to convert a function address to a file name and line: $ ./scripts/faddr2line ~/k/vmlinux write_sysrq_trigger+0x51/0x60 write_sysrq_trigger+0x51/0x60: write_sysrq_trigger at drivers/tty/sysrq.c:1098 Or gdb can be used: $ echo "list *write_sysrq_trigger+0x51" |gdb ~/k/vmlinux |grep "is in" (gdb) 0xffffffff815b5d83 is in driver_probe_device (/home/jpoimboe/git/linux/drivers/base/dd.c:378). (But note that when there are duplicate symbol names, gdb will only show the first symbol it finds. faddr2line is recommended over gdb because it handles duplicates and it also does function size checking.) Here's an example of what a stack dump looks like after this change: BUG: unable to handle kernel NULL pointer dereference at (null) IP: sysrq_handle_crash+0x45/0x80 PGD 36bfa067 [ 29.650644] PUD 7aca3067 Oops: 0002 [#1] PREEMPT SMP Modules linked in: ... CPU: 1 PID: 786 Comm: bash Tainted: G E 4.9.0-rc1+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014 task: ffff880078582a40 task.stack: ffffc90000ba8000 RIP: 0010:sysrq_handle_crash+0x45/0x80 RSP: 0018:ffffc90000babdc8 EFLAGS: 00010296 RAX: ffff880078582a40 RBX: 0000000000000063 RCX: 0000000000000001 RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000292 RBP: ffffc90000babdc8 R08: 0000000b31866061 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000007 R14: ffffffff81ee8680 R15: 0000000000000000 FS: 00007ffb43869700(0000) GS:ffff88007d400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000007a3e9000 CR4: 00000000001406e0 Stack: ffffc90000babe00 ffffffff81572d08 ffffffff81572bd5 0000000000000002 0000000000000000 ffff880079606600 00007ffb4386e000 ffffc90000babe20 ffffffff81573201 ffff880036a3fd00 fffffffffffffffb ffffc90000babe40 Call Trace: __handle_sysrq+0x138/0x220 ? __handle_sysrq+0x5/0x220 write_sysrq_trigger+0x51/0x60 proc_reg_write+0x42/0x70 __vfs_write+0x37/0x140 ? preempt_count_sub+0xa1/0x100 ? __sb_start_write+0xf5/0x210 ? vfs_write+0x183/0x1a0 vfs_write+0xb8/0x1a0 SyS_write+0x58/0xc0 entry_SYSCALL_64_fastpath+0x1f/0xc2 RIP: 0033:0x7ffb42f55940 RSP: 002b:00007ffd33bb6b18 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000046 RCX: 00007ffb42f55940 RDX: 0000000000000002 RSI: 00007ffb4386e000 RDI: 0000000000000001 RBP: 0000000000000011 R08: 00007ffb4321ea40 R09: 00007ffb43869700 R10: 00007ffb43869700 R11: 0000000000000246 R12: 0000000000778a10 R13: 00007ffd33bb5c00 R14: 0000000000000007 R15: 0000000000000010 Code: 34 e8 d0 34 bc ff 48 c7 c2 3b 2b 57 81 be 01 00 00 00 48 c7 c7 e0 dd e5 81 e8 a8 55 ba ff c7 05 0e 3f de 00 01 00 00 00 0f ae f8 <c6> 04 25 00 00 00 00 01 5d c3 e8 4c 49 bc ff 84 c0 75 c3 48 c7 RIP: sysrq_handle_crash+0x45/0x80 RSP: ffffc90000babdc8 CR2: 0000000000000000 Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/69329cb29b8f324bb5fcea14d61d224807fb6488.1477405374.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 25 10月, 2016 2 次提交
-
-
由 Arnd Bergmann 提交于
gcc -Wmaybe-uninitialized detects that quirk_intel_brickland_xeon_ras_cap uses uninitialized data when CONFIG_PCI is not set: arch/x86/kernel/quirks.c: In function ‘quirk_intel_brickland_xeon_ras_cap’: arch/x86/kernel/quirks.c:641:13: error: ‘capid0’ is used uninitialized in this function [-Werror=uninitialized] However, the function is also not called in this configuration, so we can avoid the warning by moving the existing #ifdef to cover it as well. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-pci@vger.kernel.org Link: http://lkml.kernel.org/r/20161024153325.2752428-1-arnd@arndb.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
Vince Waver reported the following bug: WARNING: CPU: 0 PID: 21338 at arch/x86/mm/fault.c:435 vmalloc_fault+0x58/0x1f0 CPU: 0 PID: 21338 Comm: perf_fuzzer Not tainted 4.8.0+ #37 Hardware name: Hewlett-Packard HP Compaq Pro 6305 SFF/1850, BIOS K06 v02.57 08/16/2013 Call Trace: <NMI> ? dump_stack+0x46/0x59 ? __warn+0xd5/0xee ? vmalloc_fault+0x58/0x1f0 ? __do_page_fault+0x6d/0x48e ? perf_log_throttle+0xa4/0xf4 ? trace_page_fault+0x22/0x30 ? __unwind_start+0x28/0x42 ? perf_callchain_kernel+0x75/0xac ? get_perf_callchain+0x13a/0x1f0 ? perf_callchain+0x6a/0x6c ? perf_prepare_sample+0x71/0x2eb ? perf_event_output_forward+0x1a/0x54 ? __default_send_IPI_shortcut+0x10/0x2d ? __perf_event_overflow+0xfb/0x167 ? x86_pmu_handle_irq+0x113/0x150 ? native_read_msr+0x6/0x34 ? perf_event_nmi_handler+0x22/0x39 ? perf_ibs_nmi_handler+0x4a/0x51 ? perf_event_nmi_handler+0x22/0x39 ? nmi_handle+0x4d/0xf0 ? perf_ibs_handle_irq+0x3d1/0x3d1 ? default_do_nmi+0x3c/0xd5 ? do_nmi+0x92/0x102 ? end_repeat_nmi+0x1a/0x1e ? entry_SYSCALL_64_after_swapgs+0x12/0x4a ? entry_SYSCALL_64_after_swapgs+0x12/0x4a ? entry_SYSCALL_64_after_swapgs+0x12/0x4a <EOE> ^A4---[ end trace 632723104d47d31a ]--- BUG: stack guard page was hit at ffffc90008500000 (stack is ffffc900084fc000..ffffc900084fffff) kernel stack overflow (page fault): 0000 [#1] SMP ... The NMI hit in the entry code right after setting up the stack pointer from 'cpu_current_top_of_stack', so the kernel stack was empty. The 'guess' version of __unwind_start() attempted to dereference the "top of stack" pointer, which is not actually *on* the stack. Add a check in the guess unwinder to deal with an empty stack. (The frame pointer unwinder already has such a check.) Reported-by: NVince Weaver <vincent.weaver@maine.edu> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 7c7900f8 ("x86/unwind: Add new unwind interface and implementations") Link: http://lkml.kernel.org/r/20161024133127.e5evgeebdbohnmpb@trebleSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 24 10月, 2016 1 次提交
-
-
由 Sinan Kaya 提交于
Ondrej reported that IRQs stopped working in v4.7 on several platforms. A typical scenario, from Ondrej's VT82C694X/694X, is: ACPI: Using PIC for interrupt routing ACPI: PCI Interrupt Link [LNKA] (IRQs 1 3 4 5 6 7 10 *11 12 14 15) ACPI: No IRQ available for PCI Interrupt Link [LNKA] 8139too 0000:00:0f.0: PCI INT A: no GSI We're using PIC routing, so acpi_irq_balance == 0, and LNKA is already active at IRQ 11. In that case, acpi_pci_link_allocate() only tries to use the active IRQ (IRQ 11) which also happens to be the SCI. We should penalize the SCI by PIRQ_PENALTY_PCI_USING, but irq_get_trigger_type(11) returns something other than IRQ_TYPE_LEVEL_LOW, so we penalize it by PIRQ_PENALTY_ISA_ALWAYS instead, which makes acpi_pci_link_allocate() assume the IRQ isn't available and give up. Add acpi_penalize_sci_irq() so platforms can tell us the SCI IRQ, trigger, and polarity directly and we don't have to depend on irq_get_trigger_type(). Fixes: 103544d8 (ACPI,PCI,IRQ: reduce resource requirements) Link: http://lkml.kernel.org/r/201609251512.05657.linux@rainbow-software.orgReported-by: NOndrej Zary <linux@rainbow-software.org> Acked-by: NBjorn Helgaas <bhelgaas@google.com> Signed-off-by: NSinan Kaya <okaya@codeaurora.org> Tested-by: NJonathan Liu <net147@gmail.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 22 10月, 2016 1 次提交
-
-
由 Ville Syrjälä 提交于
Apparently trying to poke a disabled or non-existent APIC leads to a box that doesn't even boot. Let's not do that. No real clue if this is the right fix, but at least my P3 machine boots again. Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Len Brown <len.brown@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yinghai Lu <yinghai@kernel.org> Cc: dyoung@redhat.com Cc: kexec@lists.infradead.org Cc: stable@vger.kernel.org Fixes: 2a51fe08 ("arch/x86: Handle non enumerated CPU after physical hotplug") Link: http://lkml.kernel.org/r/1477102684-5092-1-git-send-email-ville.syrjala@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 21 10月, 2016 6 次提交
-
-
由 Josh Poimboeuf 提交于
The value of regs->orig_ax contains potentially useful debugging data: For syscalls it contains the syscall number. For interrupts it contains the (negated) vector number. To reduce noise, print it only if it has a useful value (i.e., something other than -1). Here's what it looks like for a write syscall: RIP: 0033:[<00007f53ad7b1940>] 0x7f53ad7b1940 RSP: 002b:00007fff8de66558 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000046 RCX: 00007f53ad7b1940 RDX: 0000000000000002 RSI: 00007f53ae0ca000 RDI: 0000000000000001 ... Suggested-by: NAndy Lutomirski <luto@amacapital.net> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/93f0fe0307a4af884d3fca00edabcc8cff236002.1476973742.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
The RIP address is shown twice in __show_regs(). Before: RIP: 0010:[<ffffffff81070446>] [<ffffffff81070446>] native_write_msr+0x6/0x30 After: RIP: 0010:[<ffffffff81070446>] native_write_msr+0x6/0x30 Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/b3fda66f36761759b000883b059cdd9a7649dcc1.1476973742.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
Now that we can find pt_regs registers on the stack, print them. Here's an example of what it looks like: Call Trace: <IRQ> [<ffffffff8144b793>] dump_stack+0x86/0xc3 [<ffffffff81142c73>] hrtimer_interrupt+0xb3/0x1c0 [<ffffffff8105eb86>] local_apic_timer_interrupt+0x36/0x60 [<ffffffff818b27cd>] smp_apic_timer_interrupt+0x3d/0x50 [<ffffffff818b06ee>] apic_timer_interrupt+0x9e/0xb0 RIP: 0010:[<ffffffff818aef43>] [<ffffffff818aef43>] _raw_spin_unlock_irq+0x33/0x60 RSP: 0018:ffff880079c4f760 EFLAGS: 00000202 RAX: ffff880078738000 RBX: ffff88007d3da0c0 RCX: 0000000000000007 RDX: 0000000000006d78 RSI: ffff8800787388f0 RDI: ffff880078738000 RBP: ffff880079c4f768 R08: 0000002199088f38 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff81e0d540 R13: ffff8800369fb700 R14: 0000000000000000 R15: ffff880078738000 <EOI> [<ffffffff810e1f14>] finish_task_switch+0xb4/0x250 [<ffffffff810e1ed6>] ? finish_task_switch+0x76/0x250 [<ffffffff818a7b61>] __schedule+0x3e1/0xb20 ... [<ffffffff810759c8>] trace_do_page_fault+0x58/0x2c0 [<ffffffff8106f7dc>] do_async_page_fault+0x2c/0xa0 [<ffffffff818b1dd8>] async_page_fault+0x28/0x30 RIP: 0010:[<ffffffff8145b062>] [<ffffffff8145b062>] __clear_user+0x42/0x70 RSP: 0018:ffff880079c4fd38 EFLAGS: 00010202 RAX: 0000000000000000 RBX: 0000000000000138 RCX: 0000000000000138 RDX: 0000000000000000 RSI: 0000000000000008 RDI: 000000000061b640 RBP: ffff880079c4fd48 R08: 0000002198feefd7 R09: ffffffff82a40928 R10: 0000000000000001 R11: 0000000000000000 R12: 000000000061b640 R13: 0000000000000000 R14: ffff880079c50000 R15: ffff8800791d7400 [<ffffffff8145b043>] ? __clear_user+0x23/0x70 [<ffffffff8145b0fb>] clear_user+0x2b/0x40 [<ffffffff812fbda2>] load_elf_binary+0x1472/0x1750 [<ffffffff8129a591>] search_binary_handler+0xa1/0x200 [<ffffffff8129b69b>] do_execveat_common.isra.36+0x6cb/0x9f0 [<ffffffff8129b5f3>] ? do_execveat_common.isra.36+0x623/0x9f0 [<ffffffff8129bcaa>] SyS_execve+0x3a/0x50 [<ffffffff81003f5c>] do_syscall_64+0x6c/0x1e0 [<ffffffff818afa3f>] entry_SYSCALL64_slow_path+0x25/0x25 RIP: 0033:[<00007fd2e2f2e537>] [<00007fd2e2f2e537>] 0x7fd2e2f2e537 RSP: 002b:00007ffc449c5fc8 EFLAGS: 00000246 RAX: ffffffffffffffda RBX: 00007ffc449c8860 RCX: 00007fd2e2f2e537 RDX: 000000000127cc40 RSI: 00007ffc449c8860 RDI: 00007ffc449c6029 RBP: 00007ffc449c60b0 R08: 65726f632d667265 R09: 00007ffc449c5e20 R10: 00000000000005a7 R11: 0000000000000246 R12: 000000000127cc40 R13: 000000000127ce05 R14: 00007ffc449c6029 R15: 000000000127ce01 Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/5cc2c512ec82cfba00dd22467644d4ed751a48c0.1476973742.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
show_trace_log_lvl() prints the stack id (e.g. "<IRQ>") without a newline so that any stack address printed after it will appear on the same line. That causes the first stack address to be vertically misaligned with the rest, making it visually cluttered and slightly confusing: Call Trace: <IRQ> [<ffffffff814431c3>] dump_stack+0x86/0xc3 [<ffffffff8100828b>] perf_callchain_kernel+0x14b/0x160 [<ffffffff811e915f>] get_perf_callchain+0x15f/0x2b0 ... <EOI> [<ffffffff8189c6c3>] ? _raw_spin_unlock_irq+0x33/0x60 [<ffffffff810e1c84>] finish_task_switch+0xb4/0x250 [<ffffffff8106f7dc>] do_async_page_fault+0x2c/0xa0 It will look worse once we start printing pt_regs registers found in the middle of the stack: <IRQ> RIP: 0010:[<ffffffff8189c6c3>] [<ffffffff8189c6c3>] _raw_spin_unlock_irq+0x33/0x60 RSP: 0018:ffff88007876f720 EFLAGS: 00000206 RAX: ffff8800786caa40 RBX: ffff88007d5da140 RCX: 0000000000000007 ... Improve readability by adding a newline to the stack name: Call Trace: <IRQ> [<ffffffff814431c3>] dump_stack+0x86/0xc3 [<ffffffff8100828b>] perf_callchain_kernel+0x14b/0x160 [<ffffffff811e915f>] get_perf_callchain+0x15f/0x2b0 ... <EOI> [<ffffffff8189c6c3>] ? _raw_spin_unlock_irq+0x33/0x60 [<ffffffff810e1c84>] finish_task_switch+0xb4/0x250 [<ffffffff8106f7dc>] do_async_page_fault+0x2c/0xa0 Now that "continued" lines are no longer needed, we can also remove the hack of using the empty string (aka KERN_CONT) and replace it with KERN_DEFAULT. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/9bdd6dee2c74555d45500939fcc155997dc7889e.1476973742.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
The entry code doesn't encode the pt_regs pointer for syscalls. But the pt_regs are always at the same location, so we can add a manual check for them. A later patch prints them as part of the oops stack dump. They could be useful, for example, to determine the arguments to a system call. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/e176aa9272930cd3f51fda0b94e2eae356677da4.1476973742.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
With frame pointers, when a task is interrupted, its stack is no longer completely reliable because the function could have been interrupted before it had a chance to save the previous frame pointer on the stack. So the caller of the interrupted function could get skipped by a stack trace. This is problematic for live patching, which needs to know whether a stack trace of a sleeping task can be relied upon. There's currently no way to detect if a sleeping task was interrupted by a page fault exception or preemption before it went to sleep. Another issue is that when dumping the stack of an interrupted task, the unwinder has no way of knowing where the saved pt_regs registers are, so it can't print them. This solves those issues by encoding the pt_regs pointer in the frame pointer on entry from an interrupt or an exception. This patch also updates the unwinder to be able to decode it, because otherwise the unwinder would be broken by this change. Note that this causes a change in the behavior of the unwinder: each instance of a pt_regs on the stack is now considered a "frame". So callers of unwind_get_return_address() will now get an occasional 'regs->ip' address that would have previously been skipped over. Suggested-by: NAndy Lutomirski <luto@amacapital.net> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/8b9f84a21e39d249049e0547b559ff8da0df0988.1476973742.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 20 10月, 2016 8 次提交
-
-
由 Dmitry Safonov 提交于
The recent introduction of SA_X32/IA32 sa_flags added a check for user_64bit_mode() into sigaction_compat_abi(). user_64bit_mode() is true for native 64-bit processes and x32 processes. Due to that the function returns w/o setting the SA_X32_ABI flag for X32 processes. In consequence the kernel attempts to deliver the signal to the X32 process in native 64-bit mode causing the process to segfault. Remove the check, so the actual check for X32 mode which sets the ABI flag can be reached. There is no side effect for native 64-bit mode. [ tglx: Rewrote changelog ] Fixes: 68463510 ("x86/signal: Add SA_{X32,IA32}_ABI sa_flags") Reported-by: NMikulas Patocka <mpatocka@redhat.com> Tested-by: NAdam Borowski <kilobyte@angband.pl> Signed-off-by: NDmitry Safonov <0x7f454c46@gmail.com> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: linux-mm@kvack.org Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Pavel Emelyanov <xemul@virtuozzo.com> Link: http://lkml.kernel.org/r/CAJwJo6Z8ZWPqNfT6t-i8GW1MKxQrKDUagQqnZ%2B0%2B697%3DMyVeGg@mail.gmail.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
When core_kernel_text() is used to determine whether an address on a task's stack trace is a kernel text address, it incorrectly returns false for early text addresses for the head code between the _text and _stext markers. Among other things, this can cause the unwinder to behave incorrectly when unwinding to x86 head code. Head code is text code too, so mark it as such. This seems to match the intent of other users of the _stext symbol, and it also seems consistent with what other architectures are already doing. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/789cf978866420e72fa89df44aa2849426ac378d.1474480779.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
Thanks to all the recent x86 entry code refactoring, most tasks' kernel stacks start at the same offset right below their saved pt_regs, regardless of which syscall was used to enter the kernel. That creates a nice convention which makes it straightforward to identify the end of the stack, which can be useful for the unwinder to verify the stack is sane. However, the boot CPU's idle "swapper" task doesn't follow that convention. Fix that by starting its stack at a sizeof(pt_regs) offset from the end of the stack page. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/81aee3beb6ed88e44f1bea6986bb7b65c368f77a.1474480779.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
The frame at the end of each idle task stack has a zeroed return address. This is inconsistent with real task stacks, which have a real return address at that spot. This inconsistency can be confusing for stack unwinders. It also hides useful information about what asm code was involved in calling into C. Make it a real address by using the side effect of a call instruction to push the instruction pointer on the stack. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/f59593ae7b15d5126f872b0a23143173d28aa32d.1474480779.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
There are two different pieces of code for starting a CPU: start_cpu0() and the end of secondary_startup_64(). They're identical except for the stack setup. Combine the common parts into a shared start_cpu() function. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1d692ffa62fcb3cc835a5b254e953f2d9bab3549.1474480779.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
On 32-bit kernels, the initial idle stack calculation doesn't take into account the TOP_OF_KERNEL_STACK_PADDING, making the stack end address inconsistent with other tasks on 32-bit. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/6cf569410bfa84cf923902fc4d628444cace94be.1474480779.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
The frame at the end of each idle task stack is inconsistent with real task stacks, which have a stack frame header and a real return address before the pt_regs area. This inconsistency can be confusing for stack unwinders. It also hides useful information about what asm code was involved in calling into C. Fix that by changing the initial code jumps to calls. Also add infinite loops after the calls to make it clear that the calls don't return, and to hang if they do. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/2588f34b6fbac4ae6f6f9ead2a78d7f8d58a6341.1474480779.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
Add the local label prefix to all non-function named labels in head_32.S and entry_32.S. In addition to decluttering the symbol table, it also will help stack traces to be more sensible. For example, the last reported function in the idle task stack trace will be startup_32_smp() instead of is486(). Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/14f9f7afd478b23a762f40734da1a57c0c273f6e.1474480779.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 19 10月, 2016 3 次提交
-
-
由 Piotr Luc 提交于
AVX512_4VNNIW - Vector instructions for deep learning enhanced word variable precision. AVX512_4FMAPS - Vector instructions for deep learning floating-point single precision. These new instructions are to be used in future Intel Xeon & Xeon Phi processors. The bits 2&3 of CPUID[level:0x07, EDX] inform that new instructions are supported by a processor. The spec can be found in the Intel Software Developer Manual (SDM) or in the Instruction Set Extensions Programming Reference (ISE). Define new feature flags to enumerate the new instructions in /proc/cpuinfo accordingly to CPUID bits and add the required xsave extensions which are required for proper operation. Signed-off-by: NPiotr Luc <piotr.luc@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20161018150111.29926-1-piotr.luc@intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Renat Valiullin 提交于
The timer_irq_works() boot check may sometimes fail in a VM, when the Host is overcommitted or when the Guest is running nested. Since the intended check is unnecessary on VMware's virtual hardware, by-pass it. Signed-off-by: NRenat Valiullin <rvaliullin@vmware.com> Acked-by: NAlok N Kataria <akataria@vmware.com> Cc: virtualization@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20161013184539.GA11497@rvaliullin-vmSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Lorenzo Stoakes 提交于
This removes the 'write' argument from access_process_vm() and replaces it with 'gup_flags' as use of this function previously silently implied FOLL_FORCE, whereas after this patch callers explicitly pass this flag. We make this explicit as use of FOLL_FORCE can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: NLorenzo Stoakes <lstoakes@gmail.com> Acked-by: NJesper Nilsson <jesper.nilsson@axis.com> Acked-by: NMichal Hocko <mhocko@suse.com> Acked-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 16 10月, 2016 3 次提交
-
-
由 Dan Williams 提交于
Commit: 917db484 ("x86/boot: Fix kdump, cleanup aborted E820_PRAM max_pfn manipulation") ... fixed up the broken manipulations of max_pfn in the presence of E820_PRAM ranges. However, it also broke the sanitize_e820_map() support for not merging E820_PRAM ranges. Re-introduce the enabling to keep resource boundaries between consecutive defined ranges. Otherwise, for example, an environment that boots with memmap=2G!8G,2G!10G will end up with a single 4G /dev/pmem0 device instead of a /dev/pmem0 and /dev/pmem1 device 2G in size. Reported-by: NDave Chinner <david@fromorbit.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Zhang Yi <yizhan@redhat.com> Cc: linux-nvdimm@lists.01.org Fixes: 917db484 ("x86/boot: Fix kdump, cleanup aborted E820_PRAM max_pfn manipulation") Link: http://lkml.kernel.org/r/147629530854.10618.10383744751594021268.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Dmitry Vyukov 提交于
I observed false KSAN positives in the sctp code, when sctp uses jprobe_return() in jsctp_sf_eat_sack(). The stray 0xf4 in shadow memory are stack redzones: [ ] ================================================================== [ ] BUG: KASAN: stack-out-of-bounds in memcmp+0xe9/0x150 at addr ffff88005e48f480 [ ] Read of size 1 by task syz-executor/18535 [ ] page:ffffea00017923c0 count:0 mapcount:0 mapping: (null) index:0x0 [ ] flags: 0x1fffc0000000000() [ ] page dumped because: kasan: bad access detected [ ] CPU: 1 PID: 18535 Comm: syz-executor Not tainted 4.8.0+ #28 [ ] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 [ ] ffff88005e48f2d0 ffffffff82d2b849 ffffffff0bc91e90 fffffbfff10971e8 [ ] ffffed000bc91e90 ffffed000bc91e90 0000000000000001 0000000000000000 [ ] ffff88005e48f480 ffff88005e48f350 ffffffff817d3169 ffff88005e48f370 [ ] Call Trace: [ ] [<ffffffff82d2b849>] dump_stack+0x12e/0x185 [ ] [<ffffffff817d3169>] kasan_report+0x489/0x4b0 [ ] [<ffffffff817d31a9>] __asan_report_load1_noabort+0x19/0x20 [ ] [<ffffffff82d49529>] memcmp+0xe9/0x150 [ ] [<ffffffff82df7486>] depot_save_stack+0x176/0x5c0 [ ] [<ffffffff817d2031>] save_stack+0xb1/0xd0 [ ] [<ffffffff817d27f2>] kasan_slab_free+0x72/0xc0 [ ] [<ffffffff817d05b8>] kfree+0xc8/0x2a0 [ ] [<ffffffff85b03f19>] skb_free_head+0x79/0xb0 [ ] [<ffffffff85b0900a>] skb_release_data+0x37a/0x420 [ ] [<ffffffff85b090ff>] skb_release_all+0x4f/0x60 [ ] [<ffffffff85b11348>] consume_skb+0x138/0x370 [ ] [<ffffffff8676ad7b>] sctp_chunk_put+0xcb/0x180 [ ] [<ffffffff8676ae88>] sctp_chunk_free+0x58/0x70 [ ] [<ffffffff8677fa5f>] sctp_inq_pop+0x68f/0xef0 [ ] [<ffffffff8675ee36>] sctp_assoc_bh_rcv+0xd6/0x4b0 [ ] [<ffffffff8677f2c1>] sctp_inq_push+0x131/0x190 [ ] [<ffffffff867bad69>] sctp_backlog_rcv+0xe9/0xa20 [ ... ] [ ] Memory state around the buggy address: [ ] ffff88005e48f380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ ] ffff88005e48f400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ ] >ffff88005e48f480: f4 f4 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ ] ^ [ ] ffff88005e48f500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ ] ffff88005e48f580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ ] ================================================================== KASAN stack instrumentation poisons stack redzones on function entry and unpoisons them on function exit. If a function exits abnormally (e.g. with a longjmp like jprobe_return()), stack redzones are left poisoned. Later this leads to random KASAN false reports. Unpoison stack redzones in the frames we are going to jump over before doing actual longjmp in jprobe_return(). Signed-off-by: NDmitry Vyukov <dvyukov@google.com> Acked-by: NMasami Hiramatsu <mhiramat@kernel.org> Reviewed-by: NMark Rutland <mark.rutland@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Alexander Potapenko <glider@google.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: kasan-dev@googlegroups.com Cc: surovegin@google.com Cc: rostedt@goodmis.org Link: http://lkml.kernel.org/r/1476454043-101898-1-git-send-email-dvyukov@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Dmitry Vyukov 提交于
Kprobes save and restore raw stack chunks with memcpy(). With KASAN these chunks can contain poisoned stack redzones, as the result memcpy() interceptor produces false stack out-of-bounds reports. Use __memcpy() instead of memcpy() for stack copying. __memcpy() is not instrumented by KASAN and does not lead to the false reports. Currently there is a spew of KASAN reports during boot if CONFIG_KPROBES_SANITY_TEST is enabled: [ ] Kprobe smoke test: started [ ] ================================================================== [ ] BUG: KASAN: stack-out-of-bounds in setjmp_pre_handler+0x17c/0x280 at addr ffff88085259fba8 [ ] Read of size 64 by task swapper/0/1 [ ] page:ffffea00214967c0 count:0 mapcount:0 mapping: (null) index:0x0 [ ] flags: 0x2fffff80000000() [ ] page dumped because: kasan: bad access detected [...] Reported-by: NCAI Qian <caiqian@redhat.com> Tested-by: NCAI Qian <caiqian@redhat.com> Signed-off-by: NDmitry Vyukov <dvyukov@google.com> Acked-by: NMasami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Potapenko <glider@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kasan-dev@googlegroups.com [ Improved various details. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 14 10月, 2016 1 次提交
-
-
由 Wanpeng Li 提交于
=============================== [ INFO: suspicious RCU usage. ] 4.8.0+ #24 Not tainted ------------------------------- ./arch/x86/include/asm/msr-trace.h:47 suspicious rcu_dereference_check() usage! other info that might help us debug this: RCU used illegally from idle CPU! rcu_scheduler_active = 1, debug_locks = 0 RCU used illegally from extended quiescent state! no locks held by swapper/1/0. [<ffffffff9d492b95>] do_trace_write_msr+0x135/0x140 [<ffffffff9d06f860>] native_write_msr+0x20/0x30 [<ffffffff9d065fad>] native_apic_msr_eoi_write+0x1d/0x30 [<ffffffff9d05bd1d>] smp_reschedule_interrupt+0x1d/0x30 [<ffffffff9d8daec6>] reschedule_interrupt+0x96/0xa0 Reschedule interrupt may be called in cpu idle state. This causes lockdep check warning above. Add irq_enter/exit() in smp_reschedule_interrupt(), irq_enter() tells the RCU subsystems to end the extended quiescent state, so the following trace call in ack_APIC_irq() works correctly. Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Link: http://lkml.kernel.org/r/1476409733-5133-1-git-send-email-wanpeng.li@hotmail.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-