- 19 5月, 2015 1 次提交
-
-
由 Ingo Molnar 提交于
Fix a minor header file dependency bug in asm/fpu-internal.h: it relies on i387.h but does not include it. All users of fpu-internal.h included it explicitly. Also remove unnecessary includes, to reduce compilation time. This also makes it easier to use it as a standalone header file for FPU internals, such as an upcoming C module in arch/x86/kernel/fpu/. Reviewed-by: NBorislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 27 4月, 2015 1 次提交
-
-
由 Andy Lutomirski 提交于
AMD CPUs don't reinitialize the SS descriptor on SYSRET, so SYSRET with SS == 0 results in an invalid usermode state in which SS is apparently equal to __USER_DS but causes #SS if used. Work around the issue by setting SS to __KERNEL_DS __switch_to, thus ensuring that SYSRET never happens with SS set to NULL. This was exposed by a recent vDSO cleanup. Fixes: e7d6eefa x86/vdso32/syscall.S: Do not load __USER32_DS to %ss Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <vda.linux@googlemail.com> Cc: Brian Gerst <brgerst@gmail.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 4月, 2015 1 次提交
-
-
由 Denys Vlasenko 提交于
The change which affected how execve clears EXTRA_REGS missed 32-bit execve syscalls. Fix this by using 64-bit execve stub epilogue for them too. Run-tested. Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1428439424-7258-3-git-send-email-dvlasenk@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 06 4月, 2015 1 次提交
-
-
由 Brian Gerst 提交于
The 'pax' argument is unnecesary. Instead, store the RAX value directly in regs. This pattern goes all the way back to 2.1.106pre1, when restore_sigcontext() was changed to return an error code instead of EAX directly: https://git.kernel.org/cgit/linux/kernel/git/history/history.git/diff/arch/i386/kernel/signal.c?id=9a8f8b7ca3f319bd668298d447bdf32730e51174 In 2007 sigaltstack syscall support was added, where the return value of restore_sigcontext() was changed to carry the memory-copying failure code. But instead of putting 'ax' into regs->ax directly, it was carried in via a pointer and then returned, where the generic syscall return code copied it to regs->ax. So there was never any deeper reason for this suboptimal pattern, it was simply never noticed after being introduced. Signed-off-by: NBrian Gerst <brgerst@gmail.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1428152303-17154-1-git-send-email-brgerst@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 03 4月, 2015 1 次提交
-
-
由 Andy Lutomirski 提交于
SYSEXIT is scary on 64-bit kernels -- SYSEXIT must be invoked with usergs and IRQs on. That means that we rely on STI to correctly mask interrupts for one instruction. This is okay by itself, but the semantics with respect to NMIs are unclear. Avoid the whole issue by using SYSRETL instead. For background, Intel CPUs don't allow SYSCALL from compat mode, but they do allow SYSRETL back to compat mode. Go figure. To avoid doing too much at once, this doesn't revamp the calling convention. We still return with EBP, EDX, and ECX on the user stack. Oddly this seems to be 30 cycles or so faster. Avoiding POPFQ and STI will account for under half of that, I think, so my best guess is that Intel just optimizes SYSRET much better than SYSEXIT. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Denys Vlasenko <vda.linux@googlemail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/57a0bf1b5230b2716a64ebe48e9bc1110f7ab433.1428019097.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 01 4月, 2015 1 次提交
-
-
由 Denys Vlasenko 提交于
This mimics the recent similar 64-bit change. Saves ~110 bytes of code. Patch was run-tested on 32 and 64 bits, Intel and AMD CPU. I also looked at the diff of entry_64.o disassembly, to have a different view of the changes. Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427821211-25099-2-git-send-email-dvlasenk@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 27 3月, 2015 2 次提交
-
-
由 Denys Vlasenko 提交于
There are a couple of syscall argument zero-extension instructions in the 32-bit compat entry code, and it was mentioned that people keep trying to optimize them out, introducing bugs. Make them more visible, and add a "do not remove" comment. Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427452582-21624-3-git-send-email-dvlasenk@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Denys Vlasenko 提交于
The existing comment has proven to be not very clear. Replace it with a comment similar to the one we now have in the 64-bit syscall entry point. (Three instances, one per 32-bit syscall entry). In the INT80 entry point's CFI annotations, replace mysterious expressions with numric constants. In this case, raw numbers look more understandable. Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427452582-21624-2-git-send-email-dvlasenk@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 25 3月, 2015 4 次提交
-
-
由 Ingo Molnar 提交于
The THREAD_INFO() macro has a somewhat confusingly generic name, defined in a generic .h C header file. It also does not make it clear that it constructs a memory operand for use in assembly code. Rename it to ASM_THREAD_INFO() to make it all glaringly obvious on first glance. Acked-by: NBorislav Petkov <bp@suse.de> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/20150324184442.GC14760@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Ingo Molnar 提交于
Before: TI_sysenter_return+THREAD_INFO(%rsp,3*8),%r10d After: movl THREAD_INFO(TI_sysenter_return, %rsp, 3*8), %r10d to turn it into a clear thread_info accessor. No code changed: md5: fb4cb2b3ce05d89940ca304efc8ff183 ia32entry.o.before.asm fb4cb2b3ce05d89940ca304efc8ff183 ia32entry.o.after.asm e39f2958a5d1300158e276e4f7663263 entry_64.o.before.asm e39f2958a5d1300158e276e4f7663263 entry_64.o.after.asm Acked-by: NAndy Lutomirski <luto@kernel.org> Acked-by: NDenys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/20150324184411.GB14760@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Denys Vlasenko 提交于
PER_CPU_VAR(kernel_stack) was set up in a way where it points five stack slots below the top of stack. Presumably, it was done to avoid one "sub $5*8,%rsp" in syscall/sysenter code paths, where iret frame needs to be created by hand. Ironically, none of them benefits from this optimization, since all of them need to allocate additional data on stack (struct pt_regs), so they still have to perform subtraction. This patch eliminates KERNEL_STACK_OFFSET. PER_CPU_VAR(kernel_stack) now points directly to top of stack. pt_regs allocations are adjusted to allocate iret frame as well. Hopefully we can merge it later with 32-bit specific PER_CPU_VAR(cpu_current_top_of_stack) variable... Net result in generated code is that constants in several insns are changed. This change is necessary for changing struct pt_regs creation in SYSCALL64 code path from MOV to PUSH instructions. Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> Acked-by: NBorislav Petkov <bp@suse.de> Acked-by: NAndy Lutomirski <luto@kernel.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1426785469-15125-2-git-send-email-dvlasenk@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Denys Vlasenko 提交于
This changes the THREAD_INFO() definition and all its callsites so that they do not count stack position from (top of stack - KERNEL_STACK_OFFSET), but from top of stack. Semi-mysterious expressions THREAD_INFO(%rsp,RIP) - "why RIP??" are now replaced by more logical THREAD_INFO(%rsp,SIZEOF_PTREGS) - "calculate thread_info's address using information that rsp is SIZEOF_PTREGS bytes below top of stack". While at it, replace "(off)-THREAD_SIZE(reg)" with equivalent "((off)-THREAD_SIZE)(reg)". The form without parentheses falsely looks like we invoke THREAD_SIZE() macro. Improve comment atop THREAD_INFO macro definition. This patch does not change generated code (verified by objdump). Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> Acked-by: NBorislav Petkov <bp@suse.de> Acked-by: NAndy Lutomirski <luto@kernel.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1426785469-15125-1-git-send-email-dvlasenk@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 23 3月, 2015 1 次提交
-
-
由 Brian Gerst 提交于
Both the execve() and sigreturn() family of syscalls have the ability to change registers in ways that may not be compatabile with the syscall path they were called from. In particular, SYSRET and SYSEXIT can't handle non-default %cs and %ss, and some bits in eflags. These syscalls have stubs that are hardcoded to jump to the IRET path, and not return to the original syscall path. The following commit: 76f5df43 ("Always allocate a complete "struct pt_regs" on the kernel stack") recently changed this for some 32-bit compat syscalls, but introduced a bug where execve from a 32-bit program to a 64-bit program would fail because it still returned via SYSRETL. This caused Wine to fail when built for both 32-bit and 64-bit. This patch sets TIF_NOTIFY_RESUME for execve() and sigreturn() so that the IRET path is always taken on exit to userspace. Signed-off-by: NBrian Gerst <brgerst@gmail.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1426978461-32089-1-git-send-email-brgerst@gmail.com [ Improved the changelog and comments. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 06 3月, 2015 2 次提交
-
-
由 Andy Lutomirski 提交于
It has nothing to do with init -- there's only one TSS per cpu. Other names considered include: - current_tss: Confusing because we never switch the tss. - singleton_tss: Too long. This patch was generated with 's/init_tss/cpu_tss/g'. Followup patches will fix INIT_TSS and INIT_TSS_IST by hand. Signed-off-by: NAndy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/da29fb2a793e4f649d93ce2d1ed320ebe8516262.1425611534.git.luto@amacapital.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
The ia32 sysenter code loaded the top of the kernel stack into rsp by loading kernel_stack and then adjusting it. It can be simplified to just read sp0 directly. This requires the addition of a new asm-offsets entry for sp0. Signed-off-by: NAndy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/88ff9006163d296a0665338585c36d9bfb85235d.1425611534.git.luto@amacapital.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 05 3月, 2015 5 次提交
-
-
由 Denys Vlasenko 提交于
The last instance of "mysterious" SS+8 constant is replaced by SIZEOF_PTREGS. Message-Id: <1424822419-10267-1-git-send-email-dvlasenk@redhat.com> Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> Signed-off-by: NAndy Lutomirski <luto@amacapital.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/d35aeba3059407ac54f472ddcfbea767ff8916ac.1424989793.git.luto@amacapital.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Denys Vlasenko 提交于
Use of a small macro - one with conditional expansion - does more harm than good. It obfuscates code, with minimal code reuse. For example, because of obfuscation it's not obvious that in 'ia32_sysenter_target', we can optimize loading of r9 - currently it is loaded with a detour through ebp. This patch folds the IA32_ARG_FIXUP macro into its callers. No code changes. Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> Signed-off-by: NAndy Lutomirski <luto@amacapital.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/4da092094cd78734384ac31e0d4ec1d8f69145a2.1424989793.git.luto@amacapital.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Denys Vlasenko 提交于
SYSCALL/SYSRET and SYSENTER/SYSEXIT have weird semantics. Moreover, they differ in 32- and 64-bit mode. What is saved? What is not? Is rsp set? Are interrupts disabled? People tend to not remember these details well enough. This patch adds comments which explain in detail what registers are modified by each of these instructions. The comments are placed immediately before corresponding entry and exit points. Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> Signed-off-by: NAndy Lutomirski <luto@amacapital.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/a94b98b63527797c871a81402ff5060b18fa880a.1424989793.git.luto@amacapital.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Denys Vlasenko 提交于
ARGOFFSET is zero now, removing it changes no code. A few macros lost "offset" parameter, since it is always zero now too. No code changes - verified with objdump. Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> Signed-off-by: NAndy Lutomirski <luto@amacapital.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/8689f937622d9d2db0ab8be82331fa15e4ed4713.1424989793.git.luto@amacapital.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Denys Vlasenko 提交于
The 64-bit entry code was using six stack slots less by not saving/restoring registers which are callee-preserved according to the C ABI, and was not allocating space for them. Only when syscalls needed a complete "struct pt_regs" was the complete area allocated and filled in. As an additional twist, on interrupt entry a "slightly less truncated pt_regs" trick is used, to make nested interrupt stacks easier to unwind. This proved to be a source of significant obfuscation and subtle bugs. For example, 'stub_fork' had to pop the return address, extend the struct, save registers, and push return address back. Ugly. 'ia32_ptregs_common' pops return address and "returns" via jmp insn, throwing a wrench into CPU return stack cache. This patch changes the code to always allocate a complete "struct pt_regs" on the kernel stack. The saving of registers is still done lazily. "Partial pt_regs" trick on interrupt stack is retained. Macros which manipulate "struct pt_regs" on stack are reworked: - ALLOC_PT_GPREGS_ON_STACK allocates the structure. - SAVE_C_REGS saves to it those registers which are clobbered by C code. - SAVE_EXTRA_REGS saves to it all other registers. - Corresponding RESTORE_* and REMOVE_PT_GPREGS_FROM_STACK macros reverse it. 'ia32_ptregs_common', 'stub_fork' and friends lost their ugly dance with the return pointer. LOAD_ARGS32 in ia32entry.S now uses symbolic stack offsets instead of magic numbers. 'error_entry' and 'save_paranoid' now use SAVE_C_REGS + SAVE_EXTRA_REGS instead of having it open-coded yet again. Patch was run-tested: 64-bit executables, 32-bit executables, strace works. Timing tests did not show measurable difference in 32-bit and 64-bit syscalls. Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> Signed-off-by: NAndy Lutomirski <luto@amacapital.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1423778052-21038-2-git-send-email-dvlasenk@redhat.com Link: http://lkml.kernel.org/r/b89763d354aa23e670b9bdf3a40ae320320a7c2e.1424989793.git.luto@amacapital.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 04 3月, 2015 3 次提交
-
-
由 Brian Gerst 提交于
The check against lastcomm is racy, and the message it produces isn't necessary. vm86 support can be disabled on a 32-bit kernel also, and doesn't have this message. Switch to sys_ni_syscall instead. Signed-off-by: NBrian Gerst <brgerst@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1425439896-8322-4-git-send-email-brgerst@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Brian Gerst 提交于
Combine the 32-bit syscall tables into one file. Signed-off-by: NBrian Gerst <brgerst@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1425439896-8322-3-git-send-email-brgerst@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Brian Gerst 提交于
compat_ni_syscall() does the same thing as sys_ni_syscall(). Signed-off-by: NBrian Gerst <brgerst@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1425439896-8322-2-git-send-email-brgerst@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 13 2月, 2015 1 次提交
-
-
由 Andy Lutomirski 提交于
If an attacker can cause a controlled kernel stack overflow, overwriting the restart block is a very juicy exploit target. This is because the restart_block is held in the same memory allocation as the kernel stack. Moving the restart block to struct task_struct prevents this exploit by making the restart_block harder to locate. Note that there are other fields in thread_info that are also easy targets, at least on some architectures. It's also a decent simplification, since the restart code is more or less identical on all architectures. [james.hogan@imgtec.com: metag: align thread_info::supervisor_stack] Signed-off-by: NAndy Lutomirski <luto@amacapital.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: David Miller <davem@davemloft.net> Acked-by: NRichard Weinberger <richard@nod.at> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Steven Miao <realmz6@gmail.com> Cc: Mark Salter <msalter@redhat.com> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Chen Liqin <liqin.linux@gmail.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: NJames Hogan <james.hogan@imgtec.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 1月, 2015 1 次提交
-
-
由 Denys Vlasenko 提交于
The values of these two constants are the same, the meaning is different. Acked-by: NBorislav Petkov <bp@suse.de> CC: Linus Torvalds <torvalds@linux-foundation.org> CC: Oleg Nesterov <oleg@redhat.com> CC: "H. Peter Anvin" <hpa@zytor.com> CC: Borislav Petkov <bp@alien8.de> CC: Frederic Weisbecker <fweisbec@gmail.com> CC: X86 ML <x86@kernel.org> CC: Alexei Starovoitov <ast@plumgrid.com> CC: Will Drewry <wad@chromium.org> CC: Kees Cook <keescook@chromium.org> CC: linux-kernel@vger.kernel.org Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
-
- 14 12月, 2014 1 次提交
-
-
由 David Drysdale 提交于
Hook up x86-64, i386 and x32 ABIs. Signed-off-by: NDavid Drysdale <drysdale@google.com> Cc: Meredydd Luff <meredydd@senatehouse.org> Cc: Shuah Khan <shuah.kh@samsung.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Rich Felker <dalias@aerifal.cx> Cc: Christoph Hellwig <hch@infradead.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 11月, 2014 1 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 01 11月, 2014 1 次提交
-
-
由 Andy Lutomirski 提交于
Rusty noticed a Really Bad Bug (tm) in my NT fix. The entry code reads out of bounds, causing the NT fix to be unreliable. But, and this is much, much worse, if your stack is somehow just below the top of the direct map (or a hole), you read out of bounds and crash. Excerpt from the crash: [ 1.129513] RSP: 0018:ffff88001da4bf88 EFLAGS: 00010296 2b:* f7 84 24 90 00 00 00 testl $0x4000,0x90(%rsp) That read is deterministically above the top of the stack. I thought I even single-stepped through this code when I wrote it to check the offset, but I clearly screwed it up. Fixes: 8c7aa698 ("x86_64, entry: Filter RFLAGS.NT on entry from userspace") Reported-by: NRusty Russell <rusty@ozlabs.org> Cc: stable@vger.kernel.org Signed-off-by: NAndy Lutomirski <luto@amacapital.net> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 10月, 2014 1 次提交
-
-
由 Al Viro 提交于
... rather than doing that in the guts of ->load_binary(). [updated to fix the bug spotted by Shentino - for SIGSEGV we really need something stronger than send_sig_info(); again, better do that in one place] Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 07 10月, 2014 1 次提交
-
-
由 Andy Lutomirski 提交于
The NT flag doesn't do anything in long mode other than causing IRET to #GP. Oddly, CPL3 code can still set NT using popf. Entry via hardware or software interrupt clears NT automatically, so the only relevant entries are fast syscalls. If user code causes kernel code to run with NT set, then there's at least some (small) chance that it could cause trouble. For example, user code could cause a call to EFI code with NT set, and who knows what would happen? Apparently some games on Wine sometimes do this (!), and, if an IRET return happens, they will segfault. That segfault cannot be handled, because signal delivery fails, too. This patch programs the CPU to clear NT on entry via SYSCALL (both 32-bit and 64-bit, by my reading of the AMD APM), and it clears NT in software on entry via SYSENTER. To save a few cycles, this borrows a trick from Jan Beulich in Xen: it checks whether NT is set before trying to clear it. As a result, it seems to have very little effect on SYSENTER performance on my machine. There's another minor bug fix in here: it looks like the CFI annotations were wrong if CONFIG_AUDITSYSCALL=n. Testers beware: on Xen, SYSENTER with NT set turns into a GPF. I haven't touched anything on 32-bit kernels. The syscall mask change comes from a variant of this patch by Anish Bhatt. Note to stable maintainers: there is no known security issue here. A misguided program can set NT and cause the kernel to try and fail to deliver SIGSEGV, crashing the program. This patch fixes Far Cry on Wine: https://bugs.winehq.org/show_bug.cgi?id=33275 Cc: <stable@vger.kernel.org> Reported-by: NAnish Bhatt <anish@chelsio.com> Signed-off-by: NAndy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/395749a5d39a29bd3e4b35899cf3a3c1340e5595.1412189265.git.luto@amacapital.netSigned-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 24 9月, 2014 1 次提交
-
-
由 Richard Guy Briggs 提交于
Since the arch is found locally in __audit_syscall_entry(), there is no need to pass it in as a parameter. Delete it from the parameter list. x86* was the only arch to call __audit_syscall_entry() directly and did so from assembly code. Signed-off-by: NRichard Guy Briggs <rgb@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-audit@redhat.com Signed-off-by: NEric Paris <eparis@redhat.com> --- As this patch relies on changes in the audit tree, I think it appropriate to send it through my tree rather than the x86 tree.
-
- 06 5月, 2014 1 次提交
-
-
由 Andy Lutomirski 提交于
Currently, vdso.so files are prepared and analyzed by a combination of objcopy, nm, some linker script tricks, and some simple ELF parsers in the kernel. Replace all of that with plain C code that runs at build time. All five vdso images now generate .c files that are compiled and linked in to the kernel image. This should cause only one userspace-visible change: the loaded vDSO images are stripped more heavily than they used to be. Everything outside the loadable segment is dropped. In particular, this causes the section table and section name strings to be missing. This should be fine: real dynamic loaders don't load or inspect these tables anyway. The result is roughly equivalent to eu-strip's --strip-sections option. The purpose of this change is to enable the vvar and hpet mappings to be moved to the page following the vDSO load segment. Currently, it is possible for the section table to extend into the page after the load segment, so, if we map it, it risks overlapping the vvar or hpet page. This happens whenever the load segment is just under a multiple of PAGE_SIZE. The only real subtlety here is that the old code had a C file with inline assembler that did 'call VDSO32_vsyscall' and a linker script that defined 'VDSO32_vsyscall = __kernel_vsyscall'. This most likely worked by accident: the linker script entry defines a symbol associated with an address as opposed to an alias for the real dynamic symbol __kernel_vsyscall. That caused ld to relocate the reference at link time instead of leaving an interposable dynamic relocation. Since the VDSO32_vsyscall hack is no longer needed, I now use 'call __kernel_vsyscall', and I added -Bsymbolic to make it work. vdso2c will generate an error and abort the build if the resulting image contains any dynamic relocations, so we won't silently generate bad vdso images. (Dynamic relocations are a problem because nothing will even attempt to relocate the vdso.) Signed-off-by: NAndy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/2c4fcf45524162a34d87fdda1eb046b2a5cecee7.1399317206.git.luto@amacapital.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 09 11月, 2013 4 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
just getting rid of bitrot Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 02 9月, 2013 1 次提交
-
-
由 Al Viro 提交于
For performance reasons, when SMAP is in use, SMAP is left open for an entire put_user_try { ... } put_user_catch(); block, however, calling __put_user() in the middle of that block will close SMAP as the STAC..CLAC constructs intentionally do not nest. Furthermore, using __put_user() rather than put_user_ex() here is bad for performance. Thus, introduce new [compat_]save_altstack_ex() helpers that replace __[compat_]save_altstack() for x86, being currently the only architecture which supports put_user_try { ... } put_user_catch(). Reported-by: NH. Peter Anvin <hpa@linux.intel.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org> # v3.8+ Link: http://lkml.kernel.org/n/tip-es5p6y64if71k8p5u08agv9n@git.kernel.org
-
- 23 7月, 2013 1 次提交
-
-
由 Ramkumar Ramachandra 提交于
Commit 3fe26fa3 ("x86: get rid of pt_regs argument in sigreturn variants", from 2012-11-12) changed the body of PTREGSCALL to drop arg, and updated the callsites; unfortunately, it forgot to update the macro argument list, leaving an unused argument. Fix this. Signed-off-by: NRamkumar Ramachandra <artagnon@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Link: http://lkml.kernel.org/r/1373479468-7175-1-git-send-email-artagnon@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 11 7月, 2013 1 次提交
-
-
由 Michel Lespinasse 提交于
Since all architectures have been converted to use vm_unmapped_area(), there is no remaining use for the free_area_cache. Signed-off-by: NMichel Lespinasse <walken@google.com> Acked-by: NRik van Riel <riel@redhat.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Helge Deller <deller@gmx.de> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 6月, 2013 1 次提交
-
-
由 Al Viro 提交于
dump_seek() does SEEK_CUR, not SEEK_SET; native binfmt_aout handles it correctly (seeks by PAGE_SIZE - sizeof(struct user), getting the current position to PAGE_SIZE), compat one seeks by PAGE_SIZE and ends up at PAGE_SIZE + already written... Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-