- 29 8月, 2017 5 次提交
-
-
由 Thomas Gleixner 提交于
Make use of the new irqvector tracing static key and remove the duplicated trace_do_pagefault() implementation. If irq vector tracing is disabled, then the overhead of this is a single NOP5, which is a reasonable tradeoff to avoid duplicated code and the unholy macro mess. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170828064956.672965407@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Thomas Gleixner 提交于
Switching the IDT just for avoiding tracepoints creates a completely impenetrable macro/inline/ifdef mess. There is no point in avoiding tracepoints for most of the traps/exceptions. For the more expensive tracepoints, like pagefaults, this can be handled with an explicit static key. Preparatory patch to remove the tracing IDT. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170828064956.593094539@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Thomas Gleixner 提交于
Also remove the unparseable comment in the other place while at it. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170828064956.436711634@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Thomas Gleixner 提交于
This variable is beyond pointless. Nothing allocates a vector via alloc_gate() below FIRST_SYSTEM_VECTOR. So nothing can change first_system_vector. If there is a need for a gate below FIRST_SYSTEM_VECTOR then it can be added to the vector defines and FIRST_SYSTEM_VECTOR can be adjusted accordingly. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170828064956.357109735@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Thomas Gleixner 提交于
Last user (lguest) is gone. Remove it. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170828064956.201432430@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 25 8月, 2017 3 次提交
-
-
由 Eric Biggers 提交于
The following commit: 39a0526f ("x86/mm: Factor out LDT init from context init") renamed init_new_context() to init_new_context_ldt() and added a new init_new_context() which calls init_new_context_ldt(). However, the error code of init_new_context_ldt() was ignored. Consequently, if a memory allocation in alloc_ldt_struct() failed during a fork(), the ->context.ldt of the new task remained the same as that of the old task (due to the memcpy() in dup_mm()). ldt_struct's are not intended to be shared, so a use-after-free occurred after one task exited. Fix the bug by making init_new_context() pass through the error code of init_new_context_ldt(). This bug was found by syzkaller, which encountered the following splat: BUG: KASAN: use-after-free in free_ldt_struct.part.2+0x10a/0x150 arch/x86/kernel/ldt.c:116 Read of size 4 at addr ffff88006d2cb7c8 by task kworker/u9:0/3710 CPU: 1 PID: 3710 Comm: kworker/u9:0 Not tainted 4.13.0-rc4-next-20170811 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:52 print_address_description+0x73/0x250 mm/kasan/report.c:252 kasan_report_error mm/kasan/report.c:351 [inline] kasan_report+0x24e/0x340 mm/kasan/report.c:409 __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:429 free_ldt_struct.part.2+0x10a/0x150 arch/x86/kernel/ldt.c:116 free_ldt_struct arch/x86/kernel/ldt.c:173 [inline] destroy_context_ldt+0x60/0x80 arch/x86/kernel/ldt.c:171 destroy_context arch/x86/include/asm/mmu_context.h:157 [inline] __mmdrop+0xe9/0x530 kernel/fork.c:889 mmdrop include/linux/sched/mm.h:42 [inline] exec_mmap fs/exec.c:1061 [inline] flush_old_exec+0x173c/0x1ff0 fs/exec.c:1291 load_elf_binary+0x81f/0x4ba0 fs/binfmt_elf.c:855 search_binary_handler+0x142/0x6b0 fs/exec.c:1652 exec_binprm fs/exec.c:1694 [inline] do_execveat_common.isra.33+0x1746/0x22e0 fs/exec.c:1816 do_execve+0x31/0x40 fs/exec.c:1860 call_usermodehelper_exec_async+0x457/0x8f0 kernel/umh.c:100 ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431 Allocated by task 3700: save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59 save_stack+0x43/0xd0 mm/kasan/kasan.c:447 set_track mm/kasan/kasan.c:459 [inline] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551 kmem_cache_alloc_trace+0x136/0x750 mm/slab.c:3627 kmalloc include/linux/slab.h:493 [inline] alloc_ldt_struct+0x52/0x140 arch/x86/kernel/ldt.c:67 write_ldt+0x7b7/0xab0 arch/x86/kernel/ldt.c:277 sys_modify_ldt+0x1ef/0x240 arch/x86/kernel/ldt.c:307 entry_SYSCALL_64_fastpath+0x1f/0xbe Freed by task 3700: save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59 save_stack+0x43/0xd0 mm/kasan/kasan.c:447 set_track mm/kasan/kasan.c:459 [inline] kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524 __cache_free mm/slab.c:3503 [inline] kfree+0xca/0x250 mm/slab.c:3820 free_ldt_struct.part.2+0xdd/0x150 arch/x86/kernel/ldt.c:121 free_ldt_struct arch/x86/kernel/ldt.c:173 [inline] destroy_context_ldt+0x60/0x80 arch/x86/kernel/ldt.c:171 destroy_context arch/x86/include/asm/mmu_context.h:157 [inline] __mmdrop+0xe9/0x530 kernel/fork.c:889 mmdrop include/linux/sched/mm.h:42 [inline] __mmput kernel/fork.c:916 [inline] mmput+0x541/0x6e0 kernel/fork.c:927 copy_process.part.36+0x22e1/0x4af0 kernel/fork.c:1931 copy_process kernel/fork.c:1546 [inline] _do_fork+0x1ef/0xfb0 kernel/fork.c:2025 SYSC_clone kernel/fork.c:2135 [inline] SyS_clone+0x37/0x50 kernel/fork.c:2129 do_syscall_64+0x26c/0x8c0 arch/x86/entry/common.c:287 return_from_SYSCALL_64+0x0/0x7a Here is a C reproducer: #include <asm/ldt.h> #include <pthread.h> #include <signal.h> #include <stdlib.h> #include <sys/syscall.h> #include <sys/wait.h> #include <unistd.h> static void *fork_thread(void *_arg) { fork(); } int main(void) { struct user_desc desc = { .entry_number = 8191 }; syscall(__NR_modify_ldt, 1, &desc, sizeof(desc)); for (;;) { if (fork() == 0) { pthread_t t; srand(getpid()); pthread_create(&t, NULL, fork_thread, NULL); usleep(rand() % 10000); syscall(__NR_exit_group, 0); } wait(NULL); } } Note: the reproducer takes advantage of the fact that alloc_ldt_struct() may use vmalloc() to allocate a large ->entries array, and after commit: 5d17a73a ("vmalloc: back off when the current task is killed") it is possible for userspace to fail a task's vmalloc() by sending a fatal signal, e.g. via exit_group(). It would be more difficult to reproduce this bug on kernels without that commit. This bug only affected kernels with CONFIG_MODIFY_LDT_SYSCALL=y. Signed-off-by: NEric Biggers <ebiggers@google.com> Acked-by: NDave Hansen <dave.hansen@linux.intel.com> Cc: <stable@vger.kernel.org> [v4.6+] Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Fixes: 39a0526f ("x86/mm: Factor out LDT init from context init") Link: http://lkml.kernel.org/r/20170824175029.76040-1-ebiggers3@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Paolo Bonzini 提交于
The host pkru is restored right after vcpu exit (commit 1be0e61c), so KVM_GET_XSAVE will return the host PKRU value instead. Fix this by using the guest PKRU explicitly in fill_xsave and load_xsave. This part is based on a patch by Junkang Fu. The host PKRU data may also not match the value in vcpu->arch.guest_fpu.state, because it could have been changed by userspace since the last time it was saved, so skip loading it in kvm_load_guest_fpu. Reported-by: NJunkang Fu <junkang.fjk@alibaba-inc.com> Cc: Yang Zhang <zy107165@alibaba-inc.com> Fixes: 1be0e61c Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Move it to struct kvm_arch_vcpu, replacing guest_pkru_valid with a simple comparison against the host value of the register. The write of PKRU in addition can be skipped if the guest has not enabled the feature. Once we do this, we need not test OSPKE in the host anymore, because guest_CR4.PKE=1 implies host_CR4.PKE=1. The static PKU test is kept to elide the code on older CPUs. Suggested-by: NYang Zhang <zy107165@alibaba-inc.com> Fixes: 1be0e61c Cc: stable@vger.kernel.org Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 24 8月, 2017 1 次提交
-
-
由 Juergen Gross 提交于
Lguest seems to be rather unused these days. It has seen only patches ensuring it still builds the last two years and its official state is "Odd Fixes". Remove it in order to be able to clean up the paravirt code. Signed-off-by: NJuergen Gross <jgross@suse.com> Acked-by: NRusty Russell <rusty@rustcorp.com.au> Acked-by: NThomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: boris.ostrovsky@oracle.com Cc: lguest@lists.ozlabs.org Cc: rusty@rustcorp.com.au Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20170816173157.8633-3-jgross@suse.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 19 8月, 2017 1 次提交
-
-
由 Kees Cook 提交于
Moving the x86_64 and arm64 PIE base from 0x555555554000 to 0x000100000000 broke AddressSanitizer. This is a partial revert of: eab09532 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE") 02445990 ("arm64: move ELF_ET_DYN_BASE to 4GB / 4MB") The AddressSanitizer tool has hard-coded expectations about where executable mappings are loaded. The motivation for changing the PIE base in the above commits was to avoid the Stack-Clash CVEs that allowed executable mappings to get too close to heap and stack. This was mainly a problem on 32-bit, but the 64-bit bases were moved too, in an effort to proactively protect those systems (proofs of concept do exist that show 64-bit collisions, but other recent changes to fix stack accounting and setuid behaviors will minimize the impact). The new 32-bit PIE base is fine for ASan (since it matches the ET_EXEC base), so only the 64-bit PIE base needs to be reverted to let x86 and arm64 ASan binaries run again. Future changes to the 64-bit PIE base on these architectures can be made optional once a more dynamic method for dealing with AddressSanitizer is found. (e.g. always loading PIE into the mmap region for marked binaries.) Link: http://lkml.kernel.org/r/20170807201542.GA21271@beast Fixes: eab09532 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE") Fixes: 02445990 ("arm64: move ELF_ET_DYN_BASE to 4GB / 4MB") Signed-off-by: NKees Cook <keescook@chromium.org> Reported-by: NKostya Serebryany <kcc@google.com> Acked-by: NWill Deacon <will.deacon@arm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 11 8月, 2017 2 次提交
-
-
由 Juergen Gross 提交于
Provide a hook in hypervisor_x86 called after setting up initial memory mapping. This is needed e.g. by Xen HVM guests to map the hypervisor shared info page. Signed-off-by: NJuergen Gross <jgross@suse.com> Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NJuergen Gross <jgross@suse.com>
-
由 Borislav Petkov 提交于
"virtual_vmload_vmsave" is what is going to land in /proc/cpuinfo now as per v4.13-rc4, for a single feature bit which is clearly too long. So rename it to what it is called in the processor manual. "v_vmsave_vmload" is a bit shorter, after all. We could go more aggressively here but having it the same as in the processor manual is advantageous. Signed-off-by: NBorislav Petkov <bp@suse.de> Acked-by: NRadim Krčmář <rkrcmar@redhat.com> Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Cc: Jörg Rödel <joro@8bytes.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm-ML <kvm@vger.kernel.org> Link: http://lkml.kernel.org/r/20170801185552.GA3743@nazgul.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 10 8月, 2017 3 次提交
-
-
由 Andy Lutomirski 提交于
In ELF_COPY_CORE_REGS, we're copying from the current task, so accessing thread.fsbase and thread.gsbase makes no sense. Just read the values from the CPU registers. In practice, the old code would have been correct most of the time simply because thread.fsbase and thread.gsbase usually matched the CPU registers. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chang Seok <chang.seok.bae@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
Apparently the binutils 2.20 assembler can't handle the '&&' operator in the UNWIND_HINT_REGS macro. Rearrange the macro to do without it. This fixes the following error: arch/x86/entry/entry_64.S: Assembler messages: arch/x86/entry/entry_64.S:521: Error: non-constant expression in ".if" statement arch/x86/entry/entry_64.S:521: Error: non-constant expression in ".if" statement arch/x86/entry/entry_64.S:521: Error: non-constant expression in ".if" statement arch/x86/entry/entry_64.S:521: Error: non-constant expression in ".if" statement arch/x86/entry/entry_64.S:521: Error: non-constant expression in ".if" statement arch/x86/entry/entry_64.S:521: Error: non-constant expression in ".if" statement arch/x86/entry/entry_64.S:521: Error: non-constant expression in ".if" statement arch/x86/entry/entry_64.S:521: Error: non-constant expression in ".if" statement arch/x86/entry/entry_64.S:521: Error: non-constant expression in ".if" statement Reported-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 39358a03 ("objtool, x86: Add facility for asm code to provide unwind hints") Link: http://lkml.kernel.org/r/e2ad97c1ae49a484644b4aaa4dd3faa4d6d969b2.1502116651.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
The segment register high words on x86_32 may contain garbage. Teach regs_get_register() to read them as u16 instead of unsigned long. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/0b76f6dbe477b7b1a81938fddcc3c483d48f0ff2.1502314765.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 30 7月, 2017 2 次提交
-
-
由 Andy Lutomirski 提交于
Now that pt_regs properly defines segment fields as 16-bit on 32-bit CPUs, there's no need to mask off the high word. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
Many 32-bit x86 CPUs do 16-bit writes when storing segment registers to memory. This can cause the high word of regs->[cdefgs]s to occasionally contain garbage. Rather than making the entry code more complicated to fix up the garbage, just change pt_regs to reflect reality. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 27 7月, 2017 1 次提交
-
-
由 Wincy Van 提交于
We are using the same vector for nested/non-nested posted interrupts delivery, this may cause interrupts latency in L1 since we can't kick the L2 vcpu out of vmx-nonroot mode. This patch introduces a new vector which is only for nested posted interrupts to solve the problems above. Signed-off-by: NWincy Van <fanwenyi0529@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 26 7月, 2017 2 次提交
-
-
由 Josh Poimboeuf 提交于
There are three mutually exclusive unwinders. Make that more obvious by combining them into a multiple-choice selection: CONFIG_FRAME_POINTER_UNWINDER CONFIG_ORC_UNWINDER CONFIG_GUESS_UNWINDER (if CONFIG_EXPERT=y) Frame pointers are still the default (for now). The old CONFIG_FRAME_POINTER option is still used in some arch-independent places, so keep it around, but make it invisible to the user on x86 - it's now selected by CONFIG_FRAME_POINTER_UNWINDER=y. Suggested-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/20170725135424.zukjmgpz3plf5pmt@trebleSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
Add the new ORC unwinder which is enabled by CONFIG_ORC_UNWINDER=y. It plugs into the existing x86 unwinder framework. It relies on objtool to generate the needed .orc_unwind and .orc_unwind_ip sections. For more details on why ORC is used instead of DWARF, see Documentation/x86/orc-unwinder.txt - but the short version is that it's a simplified, fundamentally more robust debugninfo data structure, which also allows up to two orders of magnitude faster lookups than the DWARF unwinder - which matters to profiling workloads like perf. Thanks to Andy Lutomirski for the performance improvement ideas: splitting the ORC unwind table into two parallel arrays and creating a fast lookup table to search a subset of the unwind table. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/0a6cbfb40f8da99b7a45a1a8302dc6aef16ec812.1500938583.git.jpoimboe@redhat.com [ Extended the changelog. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 25 7月, 2017 1 次提交
-
-
由 Kees Cook 提交于
The coming x86 refcount protection needs to be able to add trailing instructions to the GEN_*_RMWcc() operations. This extracts the difference between the goto/non-goto cases so the helper macros can be defined outside the #ifdef cases. Additionally adds argument naming to the resulting asm for referencing from suffixed instructions, and adds clobbers for "cc", and "cx" to let suffixes use _ASM_CX, and retain any set flags. Signed-off-by: NKees Cook <keescook@chromium.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@infradead.org> Cc: David S. Miller <davem@davemloft.net> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Hans Liljestrand <ishkamiel@gmail.com> Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Cc: Jann Horn <jannh@google.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Serge E. Hallyn <serge@hallyn.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arozansk@redhat.com Cc: axboe@kernel.dk Cc: kernel-hardening@lists.openwall.com Cc: linux-arch <linux-arch@vger.kernel.org> Link: http://lkml.kernel.org/r/1500921349-10803-2-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 24 7月, 2017 5 次提交
-
-
由 Andy Shevchenko 提交于
Despite the following commit: 93093d09 ("x86: provide readq()/writeq() on 32-bit too, complete") which says: ...Also, map all the APIs to the strongest ordering variant. It's way too easy to mess such details up in drivers and the difference between "memory" and "" constrained asm() constructs is in the noise range. ... we have for now only one user of this API (i.e. writeq_relaxed() in drivers/hwtracing/intel_th/sth.c) on x86 and it does care about "relaxed" part of it. Moreover 32-bit support has been removed from that header, though appeared later in specific headers that emphasizes its non-atomic context. The rest should keep in mind a consistent picture of the __raw_IO() vs. IO() vs. IO_relaxed() API. Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Baolin Wang <baolin.wang@spreadtrum.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: intel-gfx@lists.freedesktop.org Cc: linux-i2c@vger.kernel.org Cc: wsa@the-dreams.de Link: http://lkml.kernel.org/r/20170630170934.83028-6-andriy.shevchenko@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Shevchenko 提交于
Generic header defines xlate_dev_kmem_ptr(). Reuse it from generic header and remove in x86 code. Move a description to the generic header as well. Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Baolin Wang <baolin.wang@spreadtrum.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: intel-gfx@lists.freedesktop.org Cc: linux-i2c@vger.kernel.org Cc: wsa@the-dreams.de Link: http://lkml.kernel.org/r/20170630170934.83028-5-andriy.shevchenko@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Shevchenko 提交于
Generic header defines memset_io, memcpy_fromio(). and memcpy_toio(). Reuse them from generic header and remove in x86 code. Move the descriptions to the generic header as well. Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Baolin Wang <baolin.wang@spreadtrum.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: intel-gfx@lists.freedesktop.org Cc: linux-i2c@vger.kernel.org Cc: wsa@the-dreams.de Link: http://lkml.kernel.org/r/20170630170934.83028-4-andriy.shevchenko@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Shevchenko 提交于
asm-generic/io.h defines few helpers which would be useful in the drivers, such as writesb() and readsb(). Include it to the asm/io.h in architectural folder. Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: NWolfram Sang <wsa@the-dreams.de> Cc: Baolin Wang <baolin.wang@spreadtrum.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: intel-gfx@lists.freedesktop.org Cc: linux-i2c@vger.kernel.org Link: http://lkml.kernel.org/r/20170630170934.83028-3-andriy.shevchenko@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Shevchenko 提交于
As a preparatory to use generic IO accessor helpers we need to define architecture dependent functions via preprocessor to let world know we have them. Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: NWolfram Sang <wsa@the-dreams.de> Cc: Baolin Wang <baolin.wang@spreadtrum.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: intel-gfx@lists.freedesktop.org Cc: linux-i2c@vger.kernel.org Link: http://lkml.kernel.org/r/20170630170934.83028-2-andriy.shevchenko@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 21 7月, 2017 1 次提交
-
-
由 Linus Torvalds 提交于
They really are, and the "take the address of a single character" makes the string fortification code unhappy (it believes that you can now only acccess one byte, rather than a byte range, and then raises errors for the memory copies going on in there). We could now remove a few 'addressof' operators (since arrays naturally degrade to pointers), but this is the minimal patch that just changes the C prototypes of those template arrays (the templates themselves are defined in inline asm). Reported-by: Nkernel test robot <xiaolong.ye@intel.com> Acked-and-tested-by: NMasami Hiramatsu <mhiramat@kernel.org> Cc: Daniel Micay <danielmicay@gmail.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 7月, 2017 2 次提交
-
-
由 Josh Poimboeuf 提交于
Mike Galbraith reported a situation where a WARN_ON_ONCE() call in DRM code turned into an oops. As it turns out, WARN_ON_ONCE() seems to be completely broken when called from a module. The bug was introduced with the following commit: 19d43626 ("debug: Add _ONCE() logic to report_bug()") That commit changed WARN_ON_ONCE() to move its 'once' logic into the bug trap handler. It requires a writable bug table so that the BUGFLAG_DONE bit can be written to the flags to indicate the first warning has occurred. The bug table was made writable for vmlinux, which relies on vmlinux.lds.S and vmlinux.lds.h for laying out the sections. However, it wasn't made writable for modules, which rely on the ELF section header flags. Reported-by: NMike Galbraith <efault@gmx.de> Tested-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 19d43626 ("debug: Add _ONCE() logic to report_bug()") Link: http://lkml.kernel.org/r/a53b04235a65478dd9afc51f5b329fdc65c84364.1500095401.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Arnd Bergmann 提交于
The x86 version of insb/insw/insl uses an inline assembly that does not have the target buffer listed as an output. This can confuse the compiler, leading it to think that a subsequent access of the buffer is uninitialized: drivers/net/wireless/wl3501_cs.c: In function ‘wl3501_mgmt_scan_confirm’: drivers/net/wireless/wl3501_cs.c:665:9: error: ‘sig.status’ is used uninitialized in this function [-Werror=uninitialized] drivers/net/wireless/wl3501_cs.c:668:12: error: ‘sig.cap_info’ may be used uninitialized in this function [-Werror=maybe-uninitialized] drivers/net/sb1000.c: In function 'sb1000_rx': drivers/net/sb1000.c:775:9: error: 'st[0]' is used uninitialized in this function [-Werror=uninitialized] drivers/net/sb1000.c:776:10: error: 'st[1]' may be used uninitialized in this function [-Werror=maybe-uninitialized] drivers/net/sb1000.c:784:11: error: 'st[1]' may be used uninitialized in this function [-Werror=maybe-uninitialized] I tried to mark the exact input buffer as an output here, but couldn't figure it out. As suggested by Linus, marking all memory as clobbered however is good enough too. For the outs operations, I also add the memory clobber, to force the input to be written to local variables. This is probably already guaranteed by the "asm volatile", but it can't hurt to do this for symmetry. Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Cc: Borislav Petkov <bp@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Link: http://lkml.kernel.org/r/20170719125310.2487451-5-arnd@arndb.de Link: https://lkml.org/lkml/2017/7/12/605Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 18 7月, 2017 3 次提交
-
-
由 Josh Poimboeuf 提交于
This enables objtool to grok the iret in the middle of a C function. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/b057be26193c11d2ed3337b2107bc7adcba42c99.1499786555.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
Some asm (and inline asm) code does special things to the stack which objtool can't understand. (Nor can GCC or GNU assembler, for that matter.) In such cases we need a facility for the code to provide annotations, so the unwinder can unwind through it. This provides such a facility, in the form of unwind hints. They're similar to the GNU assembler .cfi* directives, but they give more information, and are needed in far fewer places, because objtool can fill in the blanks by following branches and adjusting the stack pointer for pushes and pops. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/0f5f3c9104fca559ff4088bece1d14ae3bca52d5.1499786555.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Roman Kagan 提交于
A recent commit: d6e41f11 ("x86/mm, KVM: Teach KVM's VMX code that CR3 isn't a constant") introduced a VM_WARN_ON(!in_atomic()) which generates false positives on every VM entry on !CONFIG_PREEMPT_COUNT kernels. Replace it with a test for preemptible(), which appears to match the original intent and works across different CONFIG_PREEMPT* variations. Signed-off-by: NRoman Kagan <rkagan@virtuozzo.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Nadav Amit <namit@vmware.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm@vger.kernel.org Cc: linux-mm@kvack.org Fixes: d6e41f11 ("x86/mm, KVM: Teach KVM's VMX code that CR3 isn't a constant") Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 14 7月, 2017 5 次提交
-
-
由 Roman Kagan 提交于
Hyper-V identifies vCPUs by Virtual Processor Index, which can be queried via HV_X64_MSR_VP_INDEX msr. It is defined by the spec as a sequential number which can't exceed the maximum number of vCPUs per VM. APIC ids can be sparse and thus aren't a valid replacement for VP indices. Current KVM uses its internal vcpu index as VP_INDEX. However, to make it predictable and persistent across VM migrations, the userspace has to control the value of VP_INDEX. This patch achieves that, by storing vp_index explicitly on vcpu, and allowing HV_X64_MSR_VP_INDEX to be set from the host side. For compatibility it's initialized to KVM vcpu index. Also a few variables are renamed to make clear distinction betweed this Hyper-V vp_index and KVM vcpu_id (== APIC id). Besides, a new capability, KVM_CAP_HYPERV_VP_INDEX, is added to allow the userspace to skip attempting msr writes where unsupported, to avoid spamming error logs. Signed-off-by: NRoman Kagan <rkagan@virtuozzo.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Wanpeng Li 提交于
Adds another flag bit (bit 2) to MSR_KVM_ASYNC_PF_EN. If bit 2 is 1, async page faults are delivered to L1 as #PF vmexits; if bit 2 is 0, kvm_can_do_async_pf returns 0 if in guest mode. This is similar to what svm.c wanted to do all along, but it is only enabled for Linux as L1 hypervisor. Foreign hypervisors must never receive async page faults as vmexits, because they'd probably be very confused about that. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Wanpeng Li 提交于
Add an nested_apf field to vcpu->arch.exception to identify an async page fault, and constructs the expected vm-exit information fields. Force a nested VM exit from nested_vmx_check_exception() if the injected #PF is async page fault. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Wanpeng Li 提交于
This patch adds the L1 guest async page fault #PF vmexit handler, such by L1 similar to ordinary async page fault. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> [Passed insn parameters to kvm_mmu_page_fault().] Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Wanpeng Li 提交于
This patch removes all arguments except the first in kvm_x86_ops->queue_exception since they can extract the arguments from vcpu->arch.exception themselves. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 13 7月, 2017 3 次提交
-
-
由 Roman Kagan 提交于
There is a flaw in the Hyper-V SynIC implementation in KVM: when message page or event flags page is enabled by setting the corresponding msr, KVM zeroes it out. This is problematic because on migration the corresponding MSRs are loaded on the destination, so the content of those pages is lost. This went unnoticed so far because the only user of those pages was in-KVM hyperv synic timers, which could continue working despite that zeroing. Newer QEMU uses those pages for Hyper-V VMBus implementation, and zeroing them breaks the migration. Besides, in newer QEMU the content of those pages is fully managed by QEMU, so zeroing them is undesirable even when writing the MSRs from the guest side. To support this new scheme, introduce a new capability, KVM_CAP_HYPERV_SYNIC2, which, when enabled, makes sure that the synic pages aren't zeroed out in KVM. Signed-off-by: NRoman Kagan <rkagan@virtuozzo.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Ladi Prosek 提交于
The backwards_tsc_observed global introduced in commit 16a96021 is never reset to false. If a VM happens to be running while the host is suspended (a common source of the TSC jumping backwards), master clock will never be enabled again for any VM. In contrast, if no VM is running while the host is suspended, master clock is unaffected. This is inconsistent and unnecessarily strict. Let's track the backwards_tsc_observed variable separately and let each VM start with a clean slate. Real world impact: My Windows VMs get slower after my laptop undergoes a suspend/resume cycle. The only way to get the perf back is unloading and reloading the kvm module. Signed-off-by: NLadi Prosek <lprosek@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Joe Perches 提交于
Make the code like the rest of the kernel. Link: http://lkml.kernel.org/r/1cd3d401626e51ea0e2333a860e76e80bc560a4c.1499284835.git.joe@perches.comSigned-off-by: NJoe Perches <joe@perches.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-