- 05 5月, 2017 3 次提交
-
-
由 Matthias Kaehlcke 提交于
The constraint "rm" allows the compiler to put mix_const into memory. When the input operand is a memory location then MUL needs an operand size suffix, since Clang can't infer the multiplication width from the operand. Add and use the _ASM_MUL macro which determines the operand size and resolves to the NUL instruction with the corresponding suffix. This fixes the following error when building with clang: CC arch/x86/lib/kaslr.o /tmp/kaslr-dfe1ad.s: Assembler messages: /tmp/kaslr-dfe1ad.s:182: Error: no instruction mnemonic suffix given and no register operands; can't size instruction Signed-off-by: NMatthias Kaehlcke <mka@chromium.org> Cc: Grant Grundler <grundler@chromium.org> Cc: Greg Hackmann <ghackmann@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Davidson <md@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170501224741.133938-1-mka@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Baoquan He 提交于
Jeff Moyer reported that on his system with two memory regions 0~64G and 1T~1T+192G, and kernel option "memmap=192G!1024G" added, enabling KASLR will make the system hang intermittently during boot. While adding 'nokaslr' won't. The back trace is: Oops: 0000 [#1] SMP RIP: memcpy_erms() [ .... ] Call Trace: pmem_rw_page() bdev_read_page() do_mpage_readpage() mpage_readpages() blkdev_readpages() __do_page_cache_readahead() force_page_cache_readahead() page_cache_sync_readahead() generic_file_read_iter() blkdev_read_iter() __vfs_read() vfs_read() SyS_read() entry_SYSCALL_64_fastpath() This crash happens because the for loop count calculation in sync_global_pgds() is not correct. When a mapping area crosses PGD entries, we should calculate the starting address of region which next PGD covers and assign it to next for loop count, but not add PGDIR_SIZE directly. The old code works right only if the mapping area is an exact multiple of PGDIR_SIZE, otherwize the end region could be skipped so that it can't be synchronized to all other processes from kernel PGD init_mm.pgd. In Jeff's system, emulated pmem area [1024G, 1216G) is smaller than PGDIR_SIZE. While 'nokaslr' works because PAGE_OFFSET is 1T aligned, it makes this area be mapped inside one PGD entry. With KASLR enabled, this area could cross two PGD entries, then the next PGD entry won't be synced to all other processes. That is why we saw empty PGD. Fix it. Reported-by: NJeff Moyer <jmoyer@redhat.com> Signed-off-by: NBaoquan He <bhe@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Young <dyoung@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jinbum Park <jinb.park7@gmail.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1493864747-8506-1-git-send-email-bhe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
Andrey Konovalov reported the following warning while fuzzing the kernel with syzkaller: WARNING: kernel stack regs at ffff8800686869f8 in a.out:4933 has bad 'bp' value c3fc855a10167ec0 The unwinder dump revealed that RBP had a bad value when an interrupt occurred in csum_partial_copy_generic(). That function saves RBP on the stack and then overwrites it, using it as a scratch register. That's problematic because it breaks stack traces if an interrupt occurs in the middle of the function. Replace the usage of RBP with another callee-saved register (R15) so stack traces are no longer affected. Reported-by: NAndrey Konovalov <andreyknvl@google.com> Tested-by: NAndrey Konovalov <andreyknvl@google.com> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: David S . Miller <davem@davemloft.net> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: linux-sctp@vger.kernel.org Cc: netdev <netdev@vger.kernel.org> Cc: syzkaller <syzkaller@googlegroups.com> Link: http://lkml.kernel.org/r/4b03a961efda5ec9bfe46b7b9c9ad72d1efad343.1493909486.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 02 5月, 2017 1 次提交
-
-
由 Sergei Trofimovich 提交于
Starting from gcc-5.4+ gcc generates MLX instructions in more cases to refer local symbols: https://gcc.gnu.org/PR60465 That caused ia64 module loader to choke on such instructions: fuse: invalid slot number 1 for IMM64 The Linux kernel used to handle only case where relocation pointed to slot=2 instruction in the bundle. That limitation was fixed in linux by commit 9c184a07 ("[IA64] Fix 2.6 kernel for the new ia64 assembler") See http://sources.redhat.com/bugzilla/show_bug.cgi?id=1433 This change lifts the slot=2 restriction from the kernel module loader. Tested on 'fuse' and 'btrfs' kernel modules. Cc: Markus Elfring <elfring@users.sourceforge.net> Cc: H J Lu <hjl.tools@gmail.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Bug: https://bugs.gentoo.org/601014Tested-by: NÉmeric MASCHINO <emeric.maschino@gmail.com> Signed-off-by: NSergei Trofimovich <slyfox@gentoo.org> Signed-off-by: NTony Luck <tony.luck@intel.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 01 5月, 2017 1 次提交
-
-
This patch drops support for AVR32 architecture from the Linux kernel. The AVR32 architecture is not keeping up with the development of the kernel, and since it shares so much of the drivers with Atmel ARM SoC, it is starting to hinder these drivers to develop swiftly. Also, all AVR32 AP7 SoC processors are end of lifed from Atmel (now Microchip). Finally, the GCC toolchain is stuck at version 4.2.x, and has not received any patches since the last release from Atmel; 4.2.4-atmel.1.1.3.avr32linux.1. When building kernel v4.10, this toolchain is no longer able to properly link the network stack. Haavard and I have came to the conclusion that we feel keeping AVR32 on life support offers more obstacles for Atmel ARMs, than it gives joy to AVR32 users. I also suspect there are very few AVR32 users left today, if anybody at all. Signed-off-by: NHans-Christian Noren Egtvedt <egtvedt@samfundet.no> Signed-off-by: NHåvard Skinnemoen <hskinnemoen@gmail.com> Signed-off-by: NNicolas Ferre <nicolas.ferre@microchip.com> Acked-by: NAndy Shevchenko <andy.shevchenko@gmail.com> Acked-by: NBoris Brezillon <boris.brezillon@free-electrons.com>
-
- 30 4月, 2017 1 次提交
-
-
由 Janakarajan Natarajan 提交于
Newer hardware has uncovered a bug in the software implementation of using MWAITX for the delay function. A value of 0 for the timer is meant to indicate that a timeout will not be used to exit MWAITX. On newer hardware this can result in MWAITX never returning, resulting in NMI soft lockup messages being printed. On older hardware, some of the other conditions under which MWAITX can exit masked this issue. The AMD APM does not currently document this and will be updated. Please refer to http://marc.info/?l=kvm&m=148950623231140 for information regarding NMI soft lockup messages on an AMD Ryzen 1800X. This has been root-caused as a 0 passed to MWAITX causing it to wait indefinitely. This change has the added benefit of avoiding the unnecessary setup of MONITORX/MWAITX when the delay value is zero. Signed-off-by: NJanakarajan Natarajan <Janakarajan.Natarajan@amd.com> Link: http://lkml.kernel.org/r/1493156643-29366-1-git-send-email-Janakarajan.Natarajan@amd.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 28 4月, 2017 1 次提交
-
-
由 Baoquan He 提交于
Dave found that a kdump kernel with KASLR enabled will reset to the BIOS immediately if physical randomization failed to find a new position for the kernel. A kernel with the 'nokaslr' option works in this case. The reason is that KASLR will install a new page table for the identity mapping, while it missed building it for the original kernel location if KASLR physical randomization fails. This only happens in the kexec/kdump kernel, because the identity mapping has been built for kexec/kdump in the 1st kernel for the whole memory by calling init_pgtable(). Here if physical randomizaiton fails, it won't build the identity mapping for the original area of the kernel but change to a new page table '_pgtable'. Then the kernel will triple fault immediately caused by no identity mappings. The normal kernel won't see this bug, because it comes here via startup_32() and CR3 will be set to _pgtable already. In startup_32() the identity mapping is built for the 0~4G area. In KASLR we just append to the existing area instead of entirely overwriting it for on-demand identity mapping building. So the identity mapping for the original area of kernel is still there. To fix it we just switch to the new identity mapping page table when physical KASLR succeeds. Otherwise we keep the old page table unchanged just like "nokaslr" does. Signed-off-by: NBaoquan He <bhe@redhat.com> Signed-off-by: NDave Young <dyoung@redhat.com> Acked-by: NKees Cook <keescook@chromium.org> Cc: Borislav Petkov <bp@suse.de> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1493278940-5885-1-git-send-email-bhe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 27 4月, 2017 3 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
all architectures converted Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 26 4月, 2017 7 次提交
-
-
由 Andy Lutomirski 提交于
flush_tlb_page() passes a bogus range to flush_tlb_others() and expects the latter to fix it up. native_flush_tlb_others() has the fixup but Xen's version doesn't. Move the fixup to flush_tlb_others(). AFAICS the only real effect is that, without this fix, Xen would flush everything instead of just the one page on remote vCPUs in when flush_tlb_page() was called. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Nadav Amit <namit@vmware.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: e7b52ffd ("x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range") Link: http://lkml.kernel.org/r/10ed0e4dfea64daef10b87fb85df1746999b4dba.1492844372.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
I'm about to rewrite the function almost completely, but first I want to get a functional change out of the way. Currently, if flush_tlb_mm_range() does not flush the local TLB at all, it will never do individual page flushes on remote CPUs. This seems to be an accident, and preserving it will be awkward. Let's change it first so that any regressions in the rewrite will be easier to bisect and so that the rewrite can attempt to change no visible behavior at all. The fix is simple: we can simply avoid short-circuiting the calculation of base_pages_to_flush. As a side effect, this also eliminates a potential corner case: if tlb_single_page_flush_ceiling == TLB_FLUSH_ALL, flush_tlb_mm_range() could have ended up flushing the entire address space one page at a time. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Acked-by: NDave Hansen <dave.hansen@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Nadav Amit <namit@vmware.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/4b29b771d9975aad7154c314534fec235618175a.1492844372.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
I was trying to figure out what how flush_tlb_current_task() would possibly work correctly if current->mm != current->active_mm, but I realized I could spare myself the effort: it has no callers except the unused flush_tlb() macro. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Nadav Amit <namit@vmware.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/e52d64c11690f85e9f1d69d7b48cc2269cd2e94b.1492844372.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
mark_screen_rdonly() is the last remaining caller of flush_tlb(). flush_tlb_mm_range() is potentially faster and isn't obsolete. Compile-tested only because I don't know whether software that uses this mechanism even exists. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Nadav Amit <namit@vmware.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/791a644076fc3577ba7f7b7cafd643cc089baa7d.1492844372.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Kirill A. Shutemov 提交于
remove_pagetable() does page walk using p*d_page_vaddr() plus cast. It's not canonical approach -- we usually use p*d_offset() for that. It works fine as long as all page table levels are present. We broke the invariant by introducing folded p4d page table level. As result, remove_pagetable() interprets PMD as PUD and it leads to crash: BUG: unable to handle kernel paging request at ffff880300000000 IP: memchr_inv+0x60/0x110 PGD 317d067 P4D 317d067 PUD 3180067 PMD 33f102067 PTE 8000000300000060 Let's fix this by using p*d_offset() instead of p*d_page_vaddr() for page walk. Reported-by: NDan Williams <dan.j.williams@intel.com> Tested-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Fixes: f2a6a705 ("x86: Convert the rest of the code to support p4d_t") Link: http://lkml.kernel.org/r/20170425092557.21852-1-kirill.shutemov@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
Currently unwind_dump() dumps only the most recently accessed stack. But it has a few issues. In some cases, 'first_sp' can get out of sync with 'stack_info', causing unwind_dump() to start from the wrong address, flood the printk buffer, and eventually read a bad address. In other cases, dumping only the most recently accessed stack doesn't give enough data to diagnose the error. Fix both issues by dumping *all* stacks involved in the trace, not just the last one. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 8b5e99f0 ("x86/unwind: Dump stack data on warnings") Link: http://lkml.kernel.org/r/016d6a9810d7d1bfc87ef8c0e6ee041c6744c909.1493171120.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
Borislav Petkov reported the following unwinder warning: WARNING: kernel stack regs at ffffc9000024fea8 in udevadm:92 has bad 'bp' value 00007fffc4614d30 unwind stack type:0 next_sp: (null) mask:0x6 graph_idx:0 ffffc9000024fea8: 000055a6100e9b38 (0x55a6100e9b38) ffffc9000024feb0: 000055a6100e9b35 (0x55a6100e9b35) ffffc9000024feb8: 000055a6100e9f68 (0x55a6100e9f68) ffffc9000024fec0: 000055a6100e9f50 (0x55a6100e9f50) ffffc9000024fec8: 00007fffc4614d30 (0x7fffc4614d30) ffffc9000024fed0: 000055a6100eaf50 (0x55a6100eaf50) ffffc9000024fed8: 0000000000000000 ... ffffc9000024fee0: 0000000000000100 (0x100) ffffc9000024fee8: ffff8801187df488 (0xffff8801187df488) ffffc9000024fef0: 00007ffffffff000 (0x7ffffffff000) ffffc9000024fef8: 0000000000000000 ... ffffc9000024ff10: ffffc9000024fe98 (0xffffc9000024fe98) ffffc9000024ff18: 00007fffc4614d00 (0x7fffc4614d00) ffffc9000024ff20: ffffffffffffff10 (0xffffffffffffff10) ffffc9000024ff28: ffffffff811c6c1f (SyS_newlstat+0xf/0x10) ffffc9000024ff30: 0000000000000010 (0x10) ffffc9000024ff38: 0000000000000296 (0x296) ffffc9000024ff40: ffffc9000024ff50 (0xffffc9000024ff50) ffffc9000024ff48: 0000000000000018 (0x18) ffffc9000024ff50: ffffffff816b2e6a (entry_SYSCALL_64_fastpath+0x18/0xa8) ... It unwinded from an interrupt which came in right after entry code called into a C syscall handler, before it had a chance to set up the frame pointer, so regs->bp still had its user space value. Add a check to silence warnings in such a case, where an interrupt has occurred and regs->sp is almost at the end of the stack. Reported-by: NBorislav Petkov <bp@suse.de> Tested-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: c32c47c6 ("x86/unwind: Warn on bad frame pointer") Link: http://lkml.kernel.org/r/c695f0d0d4c2cfe6542b90e2d0520e11eb901eb5.1493171120.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 25 4月, 2017 1 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 24 4月, 2017 2 次提交
-
-
由 David S. Miller 提交于
Hook up statx. Ignore pkeys system calls, we don't have protection keeys on SPARC. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
This lets us enable KPROBE_EVENTS. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 4月, 2017 1 次提交
-
-
由 Ingo Molnar 提交于
This reverts commit 2947ba05. Dan Williams reported dax-pmem kernel warnings with the following signature: WARNING: CPU: 8 PID: 245 at lib/percpu-refcount.c:155 percpu_ref_switch_to_atomic_rcu+0x1f5/0x200 percpu ref (dax_pmem_percpu_release [dax_pmem]) <= 0 (0) after switching to atomic ... and bisected it to this commit, which suggests possible memory corruption caused by the x86 fast-GUP conversion. He also pointed out: " This is similar to the backtrace when we were not properly handling pud faults and was fixed with this commit: 220ced16 "mm: fix get_user_pages() vs device-dax pud mappings" I've found some missing _devmap checks in the generic get_user_pages_fast() path, but this does not fix the regression [...] " So given that there are known bugs, and a pretty robust looking bisection points to this commit suggesting that are unknown bugs in the conversion as well, revert it for the time being - we'll re-try in v4.13. Reported-by: NDan Williams <dan.j.williams@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: aneesh.kumar@linux.vnet.ibm.com Cc: dann.frazier@canonical.com Cc: dave.hansen@intel.com Cc: steve.capper@linaro.org Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 21 4月, 2017 2 次提交
-
-
由 Steven Rostedt (VMware) 提交于
Fengguang Wu's zero day bot triggered a stack unwinder dump. This can be easily triggered when CONFIG_FRAME_POINTERS is enabled and -mfentry is in use on x86_32. ># cd /sys/kernel/debug/tracing ># echo 'p:schedule schedule' > kprobe_events ># echo stacktrace > events/kprobes/schedule/trigger This is because the code that implemented fentry in the ftrace_regs_caller tried to use the least amount of #ifdefs, and modified ebp when CC_USE_FENTRY was defined to point to the parent ip as it does when CC_USE_FENTRY is not defined. But when CONFIG_FRAME_POINTERS is set, it corrupts the ebp register for this frame while doing the tracing. NOTE, it does not corrupt ebp in any other way. It is just a bad frame pointer when calling into the tracing infrastructure. The original ebp is restored before returning from the fentry call. But if a stack trace is performed inside the tracing, the unwinder will notice the bad ebp. Instead of toying with ebp with CC_USING_FENTRY, just slap the parent ip into the second parameter (%edx), and have an #else that does it the original way. The unwinder will unfortunately miss the function being traced, as the stack frame is not set up yet for it, as it is for x86_64. But fixing that is a bit more complex and did not work before anyway. This has been tested with and without FRAME_POINTERS being set while using -mfentry, as well as using an older compiler that uses mcount. Analyzed-by: NJosh Poimboeuf <jpoimboe@redhat.com> Fixes: 644e0e8d ("x86/ftrace: Add -mfentry support to x86_32 with DYNAMIC_FTRACE set") Reported-by: Nkernel test robot <fengguang.wu@intel.com> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lists.01.org/pipermail/lkp/2017-April/006165.html Link: http://lkml.kernel.org/r/20170420172236.7af7f6e5@gandalf.local.homeSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Vineet Gupta 提交于
Accumulator is present in configs with FPU and/or DSP MPY (mpy > 6) Instead of doing this in pt_regs (and thus every kernel entry/exit), this could have been done in context switch (and for user task only) as currently kernel doesn't clobber these registers for its own accord. However we will soon start using 64-bit multiply instructions for kernel which can clobber these. Also gcc folks also plan to start using these as GPRs, hence better to always save/restore them Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
-
- 20 4月, 2017 8 次提交
-
-
由 Vikas Shivappa 提交于
When schemata parses the resource names it does not return an error if it detects incorrect resource names and fails quietly. This happens because for_each_enabled_rdt_resource(r) leaves "r" pointing beyond the end of the rdt_resources_all[] array, and the check for !r->name results in an out of bounds access. Split the resource parsing part into a helper function to avoid the issue. [ tglx: Made it readable by splitting the parser loop out into a function ] Reported-by: NPrakhya, Sai Praneeth <sai.praneeth.prakhya@intel.com> Signed-off-by: NVikas Shivappa <vikas.shivappa@linux.intel.com> Tested-by: NPrakhya, Sai Praneeth <sai.praneeth.prakhya@intel.com> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: ravi.v.shankar@intel.com Cc: vikas.shivappa@intel.com Link: http://lkml.kernel.org/r/1492645804-17465-4-git-send-email-vikas.shivappa@linux.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Vikas Shivappa 提交于
Schemata is displayed in tabular format which introduces some whitespace to show data in a tabular format. Writing back the same data fails as the parser does not handle the whitespace. Trim the leading and trailing whitespace before parsing. Reported-by: NPrakhya, Sai Praneeth <sai.praneeth.prakhya@intel.com> Signed-off-by: NVikas Shivappa <vikas.shivappa@linux.intel.com> Tested-by: NPrakhya, Sai Praneeth <sai.praneeth.prakhya@intel.com> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: ravi.v.shankar@intel.com Cc: vikas.shivappa@intel.com Link: http://lkml.kernel.org/r/1492645804-17465-3-git-send-email-vikas.shivappa@linux.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Vikas Shivappa 提交于
Currently max width of 'resource name' and 'resource data' is being initialized based on 'enabled resources' during boot. But the mount can enable different capable resources at a later time which upsets the tabular format of schemata. Fix this to be based on 'all capable' resources. Signed-off-by: NVikas Shivappa <vikas.shivappa@linux.intel.com> Tested-by: NPrakhya, Sai Praneeth <sai.praneeth.prakhya@intel.com> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: ravi.v.shankar@intel.com Cc: vikas.shivappa@intel.com Link: http://lkml.kernel.org/r/1492645804-17465-2-git-send-email-vikas.shivappa@linux.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Chen Yu 提交于
Before offlining a CPU its required to check whether there are enough free irq vectors available so interrupts can be migrated away from the CPU. This check is executed whether its required or not and neither stops searching when the number of required free vectors are reached. Bypass the free vector check if the current CPU has no irq to migrate and leave the for_each_online_cpu() loop when the free vector count reaches the number of required vectors. Suggested-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NChen Yu <yu.c.chen@intel.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Len Brown <lenb@kernel.orq> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Link: http://lkml.kernel.org/r/1492357410-510-1-git-send-email-yu.c.chen@intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Matt Redfearn 提交于
The Malta platform is the only platform remaining to probe the GIC clocksource via gic_clocksource_init. This route hardcodes an expected virq number based on MIPS_GIC_IRQ_BASE, which can be fragile to the eventual virq layout. Instread, probe the driver using the preferred and more modern devicetree method. Before the driver is probed, set the "clock-frequency" property of the devicetree node to the value detected by Malta platform code. Signed-off-by: NMatt Redfearn <matt.redfearn@imgtec.com> Cc: linux-mips@linux-mips.org Cc: James Hogan <james.hogan@imgtec.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Ralf Baechle <ralf@linux-mips.org> Link: http://lkml.kernel.org/r/1492604806-23420-1-git-send-email-matt.redfearn@imgtec.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Prarit Bhargava 提交于
Modify the reschedule warning to output the offline CPU number and use a better debug message. Signed-off-by: NPrarit Bhargava <prarit@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Link: http://lkml.kernel.org/r/1492518305-3808-1-git-send-email-prarit@redhat.com [ Tweaked the warning message. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Tiantian Feng 提交于
A CPU in VMX root mode will ignore INIT signals and will fail to bring up the APs after reboot. Therefore, on a panic we disable VMX on all CPUs before rebooting or triggering kdump. Do this when halting the machine as well, in case a firmware-level reboot does not perform a cold reset for all processors. Without doing this, rebooting the host may hang. Signed-off-by: NTiantian Feng <fengtiantian@huawei.com> Signed-off-by: NXishi Qiu <qiuxishi@huawei.com> [ Rewritten commit message. ] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm@vger.kernel.org Link: http://lkml.kernel.org/r/20170419161839.30550-1-pbonzini@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Ashish Kalra 提交于
The minimum size for a new stack (512 bytes) setup for arch/x86/boot components when the bootloader does not setup/provide a stack for the early boot components is not "enough". The setup code executing as part of early kernel startup code, uses the stack beyond 512 bytes and accidentally overwrites and corrupts part of the BSS section. This is exposed mostly in the early video setup code, where it was corrupting BSS variables like force_x, force_y, which in-turn affected kernel parameters such as screen_info (screen_info.orig_video_cols) and later caused an exception/panic in console_init(). Most recent boot loaders setup the stack for early boot components, so this stack overwriting into BSS section issue has not been exposed. Signed-off-by: NAshish Kalra <ashish@bluestacks.com> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170419152015.10011-1-ashishkalra@Ashishs-MacBook-Pro.localSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 19 4月, 2017 9 次提交
-
-
由 Fu Wei 提交于
This patch adds support for parsing arch timer info in GTDT, provides some kernel APIs to parse all the PPIs and always-on info in GTDT and export them. By this driver, we can simplify arm_arch_timer drivers, and separate the ACPI GTDT knowledge from it. Signed-off-by: NFu Wei <fu.wei@linaro.org> Signed-off-by: NHanjun Guo <hanjun.guo@linaro.org> Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Reviewed-by: NHanjun Guo <hanjun.guo@linaro.org> Tested-by: NHanjun Guo <hanjun.guo@linaro.org> Acked-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: NMark Rutland <mark.rutland@arm.com>
-
由 Borislav Petkov 提交于
mce_usable_address() does a bunch of basic sanity checks to verify whether the address reported with the error is usable for further processing. However, we do check MCi_STATUS[MISCV] and that is not needed on AMD as that bit says that there's additional information about the logged error in the MCi_MISCj banks. But we don't need that to know whether the address is usable - we only need to know whether the physical address is valid - i.e., ADDRV. On Intel the MISCV bit is needed to perform additional checks to determine whether the reported address is a physical one, etc. Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Yazen Ghannam <yazen.ghannam@amd.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170418183924.6agjkebilwqj26or@pd.tnicSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Josh Poimboeuf 提交于
The 'sp' parameter to unwind_dump() is unused. Remove it. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/08cb36b004629f6bbcf44c267ae4a609242ebd0b.1492520933.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
In unwind_dump(), the stack mask value is printed in hex, but is confusingly not prepended with '0x'. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/e7fe41be19d73c9f99f53082486473febfe08ffa.1492520933.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
On x86-32, 32-bit stack values printed by unwind_dump() are confusingly zero-padded to 16 characters (64 bits): unwind stack type:0 next_sp: (null) mask:a graph_idx:0 f50cdebc: 00000000f50cdec4 (0xf50cdec4) f50cdec0: 00000000c40489b7 (irq_exit+0x87/0xa0) ... Instead, base the field width on the size of a long integer so that it looks right on both x86-32 and x86-64. x86-32: unwind stack type:1 next_sp: (null) mask:0x2 graph_idx:0 c0ee9d98: c0ee9de0 (init_thread_union+0x1de0/0x2000) c0ee9d9c: c043fd90 (__save_stack_trace+0x50/0xe0) ... x86-64: unwind stack type:1 next_sp: (null) mask:0x2 graph_idx:0 ffffffff81e03b88: ffffffff81e03c10 (init_thread_union+0x3c10/0x4000) ffffffff81e03b90: ffffffff81048f8e (__save_stack_trace+0x5e/0x100) ... Reported-by: NH. Peter Anvin <hpa@zytor.com> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/36b743812e7eb291d74af4e5067736736622daad.1492520933.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
For pre-4.6.0 versions of GCC, which don't have '-mfentry', the '-maccumulate-outgoing-args' option is required for function graph tracing in order to avoid GCC bug 42109. However, GCC ignores '-maccumulate-outgoing-args' when '-Os' is also set. Currently we force a build error to prevent that scenario, but that breaks randconfigs. So change the error to a warning which also disables CONFIG_CC_OPTIMIZE_FOR_SIZE. Reported-by: NAndi Kleen <andi@firstfloor.org> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kbuild test robot <fengguang.wu@intel.com> Cc: kbuild-all@01.org Link: http://lkml.kernel.org/r/20170418214429.o7fbwbmf4nqosezy@trebleSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Vishal Verma 提交于
The NFIT MCE handler callback (for handling media errors on NVDIMMs) takes a mutex to add the location of a memory error to a list. But since the notifier call chain for machine checks (x86_mce_decoder_chain) is atomic, we get a lockdep splat like: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620 in_atomic(): 1, irqs_disabled(): 0, pid: 4, name: kworker/0:0 [..] Call Trace: dump_stack ___might_sleep __might_sleep mutex_lock_nested ? __lock_acquire nfit_handle_mce notifier_call_chain atomic_notifier_call_chain ? atomic_notifier_call_chain mce_gen_pool_process Convert the notifier to a blocking one which gets to run only in process context. Boris: remove the notifier call in atomic context in print_mce(). For now, let's print the MCE on the atomic path so that we can make sure they go out and get logged at least. Fixes: 6839a6d9 ("nfit: do an ARS scrub on hitting a latent media error") Reported-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: NVishal Verma <vishal.l.verma@intel.com> Acked-by: NTony Luck <tony.luck@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20170411224457.24777-1-vishal.l.verma@intel.comSigned-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Nitin Gupta 提交于
Make sure the start adderess is aligned to PMD_SIZE boundary when freeing page table backing a hugepage region. The issue was causing segfaults when a region backed by 64K pages was unmapped since such a region is in general not PMD_SIZE aligned. Signed-off-by: NNitin Gupta <nitin.m.gupta@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Jordan 提交于
CONFIG_PROVE_LOCKING_SMALL shrinks the memory usage of lockdep so the kernel text, data, and bss fit in the required 32MB limit, but this option is not set for every config that enables lockdep. A 4.10 kernel fails to boot with the console output Kernel: Using 8 locked TLB entries for main kernel image. hypervisor_tlb_lock[2000000:0:8000000071c007c3:1]: errors with f Program terminated with these config options CONFIG_LOCKDEP=y CONFIG_LOCK_STAT=y CONFIG_PROVE_LOCKING=n To fix, rename CONFIG_PROVE_LOCKING_SMALL to CONFIG_LOCKDEP_SMALL, and enable this option with CONFIG_LOCKDEP=y so we get the reduced memory usage every time lockdep is turned on. Tested that CONFIG_LOCKDEP_SMALL is set to 'y' if and only if CONFIG_LOCKDEP is set to 'y'. When other lockdep-related config options that select CONFIG_LOCKDEP are enabled (e.g. CONFIG_LOCK_STAT or CONFIG_PROVE_LOCKING), verified that CONFIG_LOCKDEP_SMALL is also enabled. Fixes: e6b5f1be ("config: Adding the new config parameter CONFIG_PROVE_LOCKING_SMALL for sparc") Signed-off-by: NDaniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: NBabu Moger <babu.moger@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-