- 16 7月, 2015 1 次提交
-
-
由 Tycho Andersen 提交于
This patch is the first step in enabling checkpoint/restore of processes with seccomp enabled. One of the things CRIU does while dumping tasks is inject code into them via ptrace to collect information that is only available to the process itself. However, if we are in a seccomp mode where these processes are prohibited from making these syscalls, then what CRIU does kills the task. This patch adds a new ptrace option, PTRACE_O_SUSPEND_SECCOMP, that enables a task from the init user namespace which has CAP_SYS_ADMIN and no seccomp filters to disable (and re-enable) seccomp filters for another task so that they can be successfully dumped (and restored). We restrict the set of processes that can disable seccomp through ptrace because although today ptrace can be used to bypass seccomp, there is some discussion of closing this loophole in the future and we would like this patch to not depend on that behavior and be future proofed for when it is removed. Note that seccomp can be suspended before any filters are actually installed; this behavior is useful on criu restore, so that we can suspend seccomp, restore the filters, unmap our restore code from the restored process' address space, and then resume the task by detaching and have the filters resumed as well. v2 changes: * require that the tracer have no seccomp filters installed * drop TIF_NOTSC manipulation from the patch * change from ptrace command to a ptrace option and use this ptrace option as the flag to check. This means that as soon as the tracer detaches/dies, seccomp is re-enabled and as a corrollary that one can not disable seccomp across PTRACE_ATTACHs. v3 changes: * get rid of various #ifdefs everywhere * report more sensible errors when PTRACE_O_SUSPEND_SECCOMP is incorrectly used v4 changes: * get rid of may_suspend_seccomp() in favor of a capable() check in ptrace directly v5 changes: * check that seccomp is not enabled (or suspended) on the tracer Signed-off-by: NTycho Andersen <tycho.andersen@canonical.com> CC: Will Drewry <wad@chromium.org> CC: Roland McGrath <roland@hack.frob.com> CC: Pavel Emelyanov <xemul@parallels.com> CC: Serge E. Hallyn <serge.hallyn@ubuntu.com> Acked-by: NOleg Nesterov <oleg@redhat.com> Acked-by: NAndy Lutomirski <luto@amacapital.net> [kees: access seccomp.mode through seccomp_mode() instead] Signed-off-by: NKees Cook <keescook@chromium.org>
-
- 17 4月, 2015 2 次提交
-
-
由 Oleg Nesterov 提交于
ptrace_detach() re-checks ->ptrace under tasklist lock and calls release_task() if __ptrace_detach() returns true. This was needed because the __TASK_TRACED tracee could be killed/untraced, and it could even pass exit_notify() before we take tasklist_lock. But this is no longer possible after 9899d11f "ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL". We can turn these checks into WARN_ON() and remove release_task(). While at it, document the setting of child->exit_code. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Cc: Pavel Labath <labath@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Oleg Nesterov 提交于
ptrace_resume() is called when the tracee is still __TASK_TRACED. We set tracee->exit_code and then wake_up_state() changes tracee->state. If the tracer's sub-thread does wait() in between, task_stopped_code(ptrace => T) wrongly looks like another report from tracee. This confuses debugger, and since wait_task_stopped() clears ->exit_code the tracee can miss a signal. Test-case: #include <stdio.h> #include <unistd.h> #include <sys/wait.h> #include <sys/ptrace.h> #include <pthread.h> #include <assert.h> int pid; void *waiter(void *arg) { int stat; for (;;) { assert(pid == wait(&stat)); assert(WIFSTOPPED(stat)); if (WSTOPSIG(stat) == SIGHUP) continue; assert(WSTOPSIG(stat) == SIGCONT); printf("ERR! extra/wrong report:%x\n", stat); } } int main(void) { pthread_t thread; pid = fork(); if (!pid) { assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0); for (;;) kill(getpid(), SIGHUP); } assert(pthread_create(&thread, NULL, waiter, NULL) == 0); for (;;) ptrace(PTRACE_CONT, pid, 0, SIGCONT); return 0; } Note for stable: the bug is very old, but without 9899d11f "ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL" the fix should use lock_task_sighand(child). Signed-off-by: NOleg Nesterov <oleg@redhat.com> Reported-by: NPavel Labath <labath@google.com> Tested-by: NPavel Labath <labath@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 18 2月, 2015 1 次提交
-
-
由 Fabian Frederick 提交于
Commit 84c751bd ("ptrace: add ability to retrieve signals without removing from a queue (v4)") includes <linux/compat.h> globally in ptrace.c This patch removes inclusion under if defined CONFIG_COMPAT. Signed-off-by: NFabian Frederick <fabf@skynet.be> Acked-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 11 12月, 2014 1 次提交
-
-
由 Oleg Nesterov 提交于
Now that forget_original_parent() uses ->ptrace_entry for EXIT_DEAD tasks, we can simply pass "dead_children" list to exit_ptrace() and remove another release_task() loop. Plus this way we do not need to drop and reacquire tasklist_lock. Also shift the list_empty(ptraced) check, if we want this optimization it makes sense to eliminate the function call altogether. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com>, Cc: Sterling Alexander <stalexan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Roland McGrath <roland@hack.frob.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 16 7月, 2014 1 次提交
-
-
由 NeilBrown 提交于
The current "wait_on_bit" interface requires an 'action' function to be provided which does the actual waiting. There are over 20 such functions, many of them identical. Most cases can be satisfied by one of just two functions, one which uses io_schedule() and one which just uses schedule(). So: Rename wait_on_bit and wait_on_bit_lock to wait_on_bit_action and wait_on_bit_lock_action to make it explicit that they need an action function. Introduce new wait_on_bit{,_lock} and wait_on_bit{,_lock}_io which are *not* given an action function but implicitly use a standard one. The decision to error-out if a signal is pending is now made based on the 'mode' argument rather than being encoded in the action function. All instances of the old wait_on_bit and wait_on_bit_lock which can use the new version have been changed accordingly and their action functions have been discarded. wait_on_bit{_lock} does not return any specific error code in the event of a signal so the caller must check for non-zero and interpolate their own error code as appropriate. The wait_on_bit() call in __fscache_wait_on_invalidate() was ambiguous as it specified TASK_UNINTERRUPTIBLE but used fscache_wait_bit_interruptible as an action function. David Howells confirms this should be uniformly "uninterruptible" The main remaining user of wait_on_bit{,_lock}_action is NFS which needs to use a freezer-aware schedule() call. A comment in fs/gfs2/glock.c notes that having multiple 'action' functions is useful as they display differently in the 'wchan' field of 'ps'. (and /proc/$PID/wchan). As the new bit_wait{,_io} functions are tagged "__sched", they will not show up at all, but something higher in the stack. So the distinction will still be visible, only with different function names (gds2_glock_wait versus gfs2_glock_dq_wait in the gfs2/glock.c case). Since first version of this patch (against 3.15) two new action functions appeared, on in NFS and one in CIFS. CIFS also now uses an action function that makes the same freezer aware schedule call as NFS. Signed-off-by: NNeilBrown <neilb@suse.de> Acked-by: David Howells <dhowells@redhat.com> (fscache, keys) Acked-by: Steven Whitehouse <swhiteho@redhat.com> (gfs2) Acked-by: NPeter Zijlstra <peterz@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steve French <sfrench@samba.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140707051603.28027.72349.stgit@notabene.brownSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 06 3月, 2014 1 次提交
-
-
由 Heiko Carstens 提交于
Convert all compat system call functions where all parameter types have a size of four or less than four bytes, or are pointer types to COMPAT_SYSCALL_DEFINE. The implicit casts within COMPAT_SYSCALL_DEFINE will perform proper zero and sign extension to 64 bit of all parameters if needed. Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
-
- 13 11月, 2013 1 次提交
-
-
由 Kees Cook 提交于
The get_dumpable() return value is not boolean. Most users of the function actually want to be testing for non-SUID_DUMP_USER(1) rather than SUID_DUMP_DISABLE(0). The SUID_DUMP_ROOT(2) is also considered a protected state. Almost all places did this correctly, excepting the two places fixed in this patch. Wrong logic: if (dumpable == SUID_DUMP_DISABLE) { /* be protective */ } or if (dumpable == 0) { /* be protective */ } or if (!dumpable) { /* be protective */ } Correct logic: if (dumpable != SUID_DUMP_USER) { /* be protective */ } or if (dumpable != 1) { /* be protective */ } Without this patch, if the system had set the sysctl fs/suid_dumpable=2, a user was able to ptrace attach to processes that had dropped privileges to that user. (This may have been partially mitigated if Yama was enabled.) The macros have been moved into the file that declares get/set_dumpable(), which means things like the ia64 code can see them too. CVE-2013-2929 Reported-by: NVasily Kulikov <segoon@openwall.com> Signed-off-by: NKees Cook <keescook@chromium.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 12 9月, 2013 1 次提交
-
-
由 Mark Grondona 提交于
__ptrace_may_access() checks get_dumpable/ptrace_has_cap/etc if task != current, this can can lead to surprising results. For example, a sub-thread can't readlink("/proc/self/exe") if the executable is not readable. setup_new_exec()->would_dump() notices that inode_permission(MAY_READ) fails and then it does set_dumpable(suid_dumpable). After that get_dumpable() fails. (It is not clear why proc_pid_readlink() checks get_dumpable(), perhaps we could add PTRACE_MODE_NODUMPABLE) Change __ptrace_may_access() to use same_thread_group() instead of "task == current". Any security check is pointless when the tasks share the same ->mm. Signed-off-by: NMark Grondona <mgrondona@llnl.gov> Signed-off-by: NBen Woodard <woodard@redhat.com> Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 07 8月, 2013 1 次提交
-
-
由 Oleg Nesterov 提交于
This reverts commit fab840fc. This commit even has the test-case to prove that the tracee can be killed by SIGTRAP if the debugger does not remove the breakpoints before PTRACE_DETACH. However, this is exactly what wineserver deliberately does, set_thread_context() calls PTRACE_ATTACH + PTRACE_DETACH just for PTRACE_POKEUSER(DR*) in between. So we should revert this fix and document that PTRACE_DETACH should keep the breakpoints. Reported-by: NFelipe Contreras <felipe.contreras@gmail.com> Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 10 7月, 2013 2 次提交
-
-
由 Oleg Nesterov 提交于
Change ptrace_detach() to call flush_ptrace_hw_breakpoint(child). This frees the slots for non-ptrace PERF_TYPE_BREAKPOINT users, and this ensures that the tracee won't be killed by SIGTRAP triggered by the active breakpoints. Test-case: unsigned long encode_dr7(int drnum, int enable, unsigned int type, unsigned int len) { unsigned long dr7; dr7 = ((len | type) & 0xf) << (DR_CONTROL_SHIFT + drnum * DR_CONTROL_SIZE); if (enable) dr7 |= (DR_GLOBAL_ENABLE << (drnum * DR_ENABLE_SIZE)); return dr7; } int write_dr(int pid, int dr, unsigned long val) { return ptrace(PTRACE_POKEUSER, pid, offsetof (struct user, u_debugreg[dr]), val); } void func(void) { } int main(void) { int pid, stat; unsigned long dr7; pid = fork(); if (!pid) { assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0); kill(getpid(), SIGHUP); func(); return 0x13; } assert(pid == waitpid(-1, &stat, 0)); assert(WSTOPSIG(stat) == SIGHUP); assert(write_dr(pid, 0, (long)func) == 0); dr7 = encode_dr7(0, 1, DR_RW_EXECUTE, DR_LEN_1); assert(write_dr(pid, 7, dr7) == 0); assert(ptrace(PTRACE_DETACH, pid, 0,0) == 0); assert(pid == waitpid(-1, &stat, 0)); assert(stat == 0x1300); return 0; } Before this patch the child is killed after PTRACE_DETACH. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Prasad <prasad@linux.vnet.ibm.com> Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Oleg Nesterov 提交于
This reverts commit bf26c018 ("Prepare to fix racy accesses on task breakpoints"). The patch was fine but we can no longer race with SIGKILL after commit 9899d11f ("ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL"), the __TASK_TRACED tracee can't be woken up and ->ptrace_bps[] can't go away. Now that ptrace_get_breakpoints/ptrace_put_breakpoints have no callers, we can kill them and remove task->ptrace_bp_refcnt. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: NFrederic Weisbecker <fweisbec@gmail.com> Acked-by: NMichael Neuling <mikey@neuling.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Prasad <prasad@linux.vnet.ibm.com> Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 7月, 2013 1 次提交
-
-
由 Andrey Vagin 提交于
crtools uses a parasite code for dumping processes. The parasite code is injected into a process with help PTRACE_SEIZE. Currently crtools blocks signals from a parasite code. If a process has pending signals, crtools wait while a process handles these signals. This method is not suitable for stopped tasks. A stopped task can have a few pending signals, when we will try to execute a parasite code, we will need to drop SIGSTOP, but all other signals must remain pending, because a state of processes must not be changed during checkpointing. This patch adds two ptrace commands to set/get signal-blocked mask. I think gdb can use this commands too. [akpm@linux-foundation.org: be consistent with brace layout] Signed-off-by: NAndrey Vagin <avagin@openvz.org> Reviewed-by: NOleg Nesterov <oleg@redhat.com> Cc: Roland McGrath <roland@redhat.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 30 6月, 2013 1 次提交
-
-
由 Mathieu Desnoyers 提交于
This __put_user() could be used by unprivileged processes to write into kernel memory. The issue here is that even if copy_siginfo_to_user() fails, the error code is not checked before __put_user() is executed. Luckily, ptrace_peek_siginfo() has been added within the 3.10-rc cycle, so it has not hit a stable release yet. Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: NOleg Nesterov <oleg@redhat.com> Cc: Andrey Vagin <avagin@openvz.org> Cc: Roland McGrath <roland@redhat.com> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: David Howells <dhowells@redhat.com> Cc: Dave Jones <davej@redhat.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Pedro Alves <palves@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 08 5月, 2013 1 次提交
-
-
由 Kent Overstreet 提交于
Faster kernel compiles by way of fewer unnecessary includes. [akpm@linux-foundation.org: fix fallout] [akpm@linux-foundation.org: fix build] Signed-off-by: NKent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 01 5月, 2013 1 次提交
-
-
由 Andrey Vagin 提交于
This patch adds a new ptrace request PTRACE_PEEKSIGINFO. This request is used to retrieve information about pending signals starting with the specified sequence number. Siginfo_t structures are copied from the child into the buffer starting at "data". The argument "addr" is a pointer to struct ptrace_peeksiginfo_args. struct ptrace_peeksiginfo_args { u64 off; /* from which siginfo to start */ u32 flags; s32 nr; /* how may siginfos to take */ }; "nr" has type "s32", because ptrace() returns "long", which has 32 bits on i386 and a negative values is used for errors. Currently here is only one flag PTRACE_PEEKSIGINFO_SHARED for dumping signals from process-wide queue. If this flag is not set, signals are read from a per-thread queue. The request PTRACE_PEEKSIGINFO returns a number of dumped signals. If a signal with the specified sequence number doesn't exist, ptrace returns zero. The request returns an error, if no signal has been dumped. Errors: EINVAL - one or more specified flags are not supported or nr is negative EFAULT - buf or addr is outside your accessible address space. A result siginfo contains a kernel part of si_code which usually striped, but it's required for queuing the same siginfo back during restore of pending signals. This functionality is required for checkpointing pending signals. Pedro Alves suggested using it in "gdb" to peek at pending signals. gdb already uses PTRACE_GETSIGINFO to get the siginfo for the signal which was already dequeued. This functionality allows gdb to look at the pending signals which were not reported yet. The prototype of this code was developed by Oleg Nesterov. Signed-off-by: NAndrew Vagin <avagin@openvz.org> Cc: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: David Howells <dhowells@redhat.com> Cc: Dave Jones <davej@redhat.com> Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pedro Alves <palves@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 2月, 2013 1 次提交
-
-
由 Josh Stone 提交于
The original pull message for uprobes (commit 654443e2) noted: This tree includes uprobes support in 'perf probe' - but SystemTap (and other tools) can take advantage of user probe points as well. In order to actually be usable in module-based tools like SystemTap, the interface needs to be exported. This patch first adds the obvious exports for uprobe_register and uprobe_unregister. Then it also adds one for task_user_regset_view, which is necessary to get the correct state of userspace registers. Signed-off-by: NJosh Stone <jistone@redhat.com> Signed-off-by: NOleg Nesterov <oleg@redhat.com>
-
- 23 1月, 2013 2 次提交
-
-
由 Oleg Nesterov 提交于
putreg() assumes that the tracee is not running and pt_regs_access() can safely play with its stack. However a killed tracee can return from ptrace_stop() to the low-level asm code and do RESTORE_REST, this means that debugger can actually read/modify the kernel stack until the tracee does SAVE_REST again. set_task_blockstep() can race with SIGKILL too and in some sense this race is even worse, the very fact the tracee can be woken up breaks the logic. As Linus suggested we can clear TASK_WAKEKILL around the arch_ptrace() call, this ensures that nobody can ever wakeup the tracee while the debugger looks at it. Not only this fixes the mentioned problems, we can do some cleanups/simplifications in arch_ptrace() paths. Probably ptrace_unfreeze_traced() needs more callers, for example it makes sense to make the tracee killable for oom-killer before access_process_vm(). While at it, add the comment into may_ptrace_stop() to explain why ptrace_stop() still can't rely on SIGKILL and signal_pending_state(). Reported-by: NSalman Qazi <sqazi@google.com> Reported-by: NSuleiman Souhlal <suleiman@google.com> Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Oleg Nesterov 提交于
Cleanup and preparation for the next change. signal_wake_up(resume => true) is overused. None of ptrace/jctl callers actually want to wakeup a TASK_WAKEKILL task, but they can't specify the necessary mask. Turn signal_wake_up() into signal_wake_up_state(state), reintroduce signal_wake_up() as a trivial helper, and add ptrace_signal_wake_up() which adds __TASK_TRACED. This way ptrace_signal_wake_up() can work "inside" ptrace_request() even if the tracee doesn't have the TASK_WAKEKILL bit set. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 21 1月, 2013 1 次提交
-
-
由 Oleg Nesterov 提交于
The ia64 function "thread_matches()" has no users since commit e868a55c ("[IA64] remove find_thread_for_addr()"). Remove it. This allows us to make ptrace_check_attach() static to kernel/ptrace.c, which is good since we'll need to change the semantics of it and fix up all the callers. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 18 12月, 2012 1 次提交
-
-
由 Oleg Nesterov 提交于
Ptrace jailers want to be sure that the tracee can never escape from the control. However if the tracer dies unexpectedly the tracee continues to run in potentially unsafe mode. Add the new ptrace option PTRACE_O_EXITKILL. If the tracer exits it sends SIGKILL to every tracee which has this bit set. Note that the new option is not equal to the last-option << 1. Because currently all options have an event, and the new one starts the eventless group. It uses the random 20 bit, so we have the room for 12 more events, but we can also add the new eventless options below this one. Suggested by Amnon Shiloh. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Tested-by: NAmnon Shiloh <u3557@miso.sublimeip.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: Chris Evans <scarybeasts@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 11月, 2012 1 次提交
-
-
由 Eric W. Biederman 提交于
The task_user_ns function hides the fact that it is getting the user namespace from struct cred on the task. struct cred may go away as soon as the rcu lock is released. This leads to a race where we can dereference a stale user namespace pointer. To make it obvious a struct cred is involved kill task_user_ns. To kill the race modify the users of task_user_ns to only reference the user namespace while the rcu lock is held. Cc: Kees Cook <keescook@chromium.org> Cc: James Morris <james.l.morris@oracle.com> Acked-by: NKees Cook <keescook@chromium.org> Acked-by: NSerge Hallyn <serge.hallyn@canonical.com> Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
- 03 8月, 2012 1 次提交
-
-
由 Tetsuo Handa 提交于
__ptrace_may_access() is used within only kernel/ptrace.c. Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NJames Morris <james.l.morris@oracle.com>
-
- 03 5月, 2012 1 次提交
-
-
由 Eric W. Biederman 提交于
Update the permission checks to use the new uid_eq and gid_eq helpers and remove the now unnecessary user_ns equality comparison. Acked-by: NSerge Hallyn <serge.hallyn@canonical.com> Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
-
- 08 4月, 2012 1 次提交
-
-
由 Eric W. Biederman 提交于
Optimize performance and prepare for the removal of the user_ns reference from user_struct. Remove the slow long walk through cred->user->user_ns and instead go straight to cred->user_ns. Acked-by: NSerge Hallyn <serge.hallyn@canonical.com> Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
-
- 24 3月, 2012 4 次提交
-
-
由 Denys Vlasenko 提交于
PTRACE_SEIZE code is tested and ready for production use, remove the code which requires special bit in data argument to make PTRACE_SEIZE work. Strace team prepares for a new release of strace, and we would like to ship the code which uses PTRACE_SEIZE, preferably after this change goes into released kernel. Signed-off-by: NDenys Vlasenko <vda.linux@googlemail.com> Acked-by: NTejun Heo <tj@kernel.org> Acked-by: NOleg Nesterov <oleg@redhat.com> Cc: Pedro Alves <palves@redhat.com> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Denys Vlasenko 提交于
This can be used to close a few corner cases in strace where we get unwanted racy behavior after attach, but before we have a chance to set options (the notorious post-execve SIGTRAP comes to mind), and removes the need to track "did we set opts for this task" state in strace internals. While we are at it: Make it possible to extend SEIZE in the future with more functionality by passing non-zero 'addr' parameter. To that end, error out if 'addr' is non-zero. PTRACE_ATTACH did not (and still does not) have such check, and users (strace) do pass garbage there... let's avoid repeating this mistake with SEIZE. Set all task->ptrace bits in one operation - before this change, we were adding PT_SEIZED and PT_PTRACE_CAP with task->ptrace |= BIT ops. This was probably ok (not a bug), but let's be on a safer side. Changes since v2: use (unsigned long) casts instead of (long) ones, move PTRACE_SEIZE_DEVEL-related code to separate lines of code. Signed-off-by: NDenys Vlasenko <vda.linux@googlemail.com> Acked-by: NTejun Heo <tj@kernel.org> Cc: Pedro Alves <palves@redhat.com> Reviewed-by: NOleg Nesterov <oleg@redhat.com> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Denys Vlasenko 提交于
Exchange PT_TRACESYSGOOD and PT_PTRACE_CAP bit positions, which makes PT_option bits contiguous and therefore makes code in ptrace_setoptions() much simpler. Every PTRACE_O_TRACEevent is defined to (1 << PTRACE_EVENT_event) instead of using explicit numeric constants, to ensure we don't mess up relationship between bit positions and event ids. PT_EVENT_FLAG_SHIFT was not particularly useful, PT_OPT_FLAG_SHIFT with value of PT_EVENT_FLAG_SHIFT-1 is easier to use. PT_TRACE_MASK constant is nuked, the only its use is replaced by (PTRACE_O_MASK << PT_OPT_FLAG_SHIFT). Signed-off-by: NDenys Vlasenko <vda.linux@googlemail.com> Acked-by: NTejun Heo <tj@kernel.org> Reviewed-by: NOleg Nesterov <oleg@redhat.com> Cc: Pedro Alves <palves@redhat.com> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Denys Vlasenko 提交于
On ptrace(PTRACE_SETOPTIONS, pid, 0, <opts>), we used to set those option bits which are known, and then fail with -EINVAL if there are some unknown bits in <opts>. This is inconsistent with typical error handling, which does not change any state if input is invalid. This patch changes PTRACE_SETOPTIONS behavior so that in this case, we return -EINVAL and don't change any bits in task->ptrace. It's very unlikely that there is userspace code in the wild which will be affected by this change: it should have the form ptrace(PTRACE_SETOPTIONS, pid, 0, PTRACE_O_BOGUSOPT) where PTRACE_O_BOGUSOPT is a constant unknown to the kernel. But kernel headers, naturally, don't contain any PTRACE_O_BOGUSOPTs, thus the only way userspace can use one if it defines one itself. I can't see why anyone would do such a thing deliberately. Signed-off-by: NDenys Vlasenko <vda.linux@googlemail.com> Acked-by: NTejun Heo <tj@kernel.org> Reviewed-by: NOleg Nesterov <oleg@redhat.com> Cc: Pedro Alves <palves@redhat.com> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 1月, 2012 2 次提交
-
-
由 Eric Paris 提交于
Reading /proc/pid/stat of another process checks if one has ptrace permissions on that process. If one does have permissions it outputs some data about the process which might have security and attack implications. If the current task does not have ptrace permissions the read still works, but those fields are filled with inocuous (0) values. Since this check and a subsequent denial is not a violation of the security policy we should not audit such denials. This can be quite useful to removing ptrace broadly across a system without flooding the logs when ps is run or something which harmlessly walks proc. Signed-off-by: NEric Paris <eparis@redhat.com> Acked-by: NSerge E. Hallyn <serge.hallyn@canonical.com>
-
由 Eric Paris 提交于
task_ in the front of a function, in the security subsystem anyway, means to me at least, that we are operating with that task as the subject of the security decision. In this case what it means is that we are using current as the subject but we use the task to get the right namespace. Who in the world would ever realize that's what task_ns_capability means just by the name? This patch eliminates the task_ns functions entirely and uses the has_ns_capability function instead. This means we explicitly open code the ns in question in the caller. I think it makes the caller a LOT more clear what is going on. Signed-off-by: NEric Paris <eparis@redhat.com> Acked-by: NSerge E. Hallyn <serge.hallyn@canonical.com>
-
- 05 1月, 2012 1 次提交
-
-
由 Oleg Nesterov 提交于
This is the temporary simple fix for 3.2, we need more changes in this area. 1. do_signal_stop() assumes that the running untraced thread in the stopped thread group is not possible. This was our goal but it is not yet achieved: a stopped-but-resumed tracee can clone the running thread which can initiate another group-stop. Remove WARN_ON_ONCE(!current->ptrace). 2. A new thread always starts with ->jobctl = 0. If it is auto-attached and this group is stopped, __ptrace_unlink() sets JOBCTL_STOP_PENDING but JOBCTL_STOP_SIGMASK part is zero, this triggers WANR_ON(!signr) in do_jobctl_trap() if another debugger attaches. Change __ptrace_unlink() to set the artificial SIGSTOP for report. Alternatively we could change ptrace_init_task() to copy signr from current, but this means we can copy it for no reason and hide the possible similar problems. Acked-by: NTejun Heo <tj@kernel.org> Cc: <stable@kernel.org> [3.1] Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 31 10月, 2011 1 次提交
-
-
由 Paul Gortmaker 提交于
The changed files were only including linux/module.h for the EXPORT_SYMBOL infrastructure, and nothing else. Revector them onto the isolated export header for faster compile times. Nothing to see here but a whole lot of instances of: -#include <linux/module.h> +#include <linux/export.h> This commit is only changing the kernel dir; next targets will probably be mm, fs, the arch dirs, etc. Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
- 26 9月, 2011 1 次提交
-
-
由 Oleg Nesterov 提交于
If PTRACE_LISTEN fails after lock_task_sighand() it doesn't drop ->siglock. Reported-by: NMatt Fleming <matt.fleming@intel.com> Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 19 7月, 2011 1 次提交
-
-
由 Vladimir Zapolskiy 提交于
This change adds a procfs connector event, which is emitted on every successful process tracer attach or detach. If some process connects to other one, kernelspace connector reports process id and thread group id of both these involved processes. On disconnection null process id is returned. Such an event allows to create a simple automated userspace mechanism to be aware about processes connecting to others, therefore predefined process policies can be applied to them if needed. Note, a detach signal is emitted only in case, if a tracer process explicitly executes PTRACE_DETACH request. In other cases like tracee or tracer exit detach event from proc connector is not reported. Signed-off-by: NVladimir Zapolskiy <vzapolskiy@gmail.com> Acked-by: NEvgeniy Polyakov <zbr@ioremap.net> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: NOleg Nesterov <oleg@redhat.com>
-
- 28 6月, 2011 2 次提交
-
-
由 Oleg Nesterov 提交于
__ptrace_detach() and do_notify_parent() set task->exit_signal = -1 to mark the task dead. This is no longer needed, nobody checks exit_signal to detect the EXIT_DEAD task. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Reviewed-by: NTejun Heo <tj@kernel.org>
-
由 Oleg Nesterov 提交于
__ptrace_detach() relies on the current obscure behaviour of do_notify_parent(tsk) which changes tsk->exit_signal if this child should be silently reaped. That is why we check task_detached(), it is true if the task is sub-thread, or it is the group_leader but its exit_signal was changed by do_notify_parent(). This is confusing, change the code to rely on !thread_group_leader() or the value returned by do_notify_parent(). Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: NTejun Heo <tj@kernel.org>
-
- 17 6月, 2011 3 次提交
-
-
由 Tejun Heo 提交于
The previous patch implemented async notification for ptrace but it only worked while trace is running. This patch introduces PTRACE_LISTEN which is suggested by Oleg Nestrov. It's allowed iff tracee is in STOP trap and puts tracee into quasi-running state - tracee never really runs but wait(2) and ptrace(2) consider it to be running. While ptracer is listening, tracee is allowed to re-enter STOP to notify an async event. Listening state is cleared on the first notification. Ptracer can also clear it by issuing INTERRUPT - tracee will re-trap into STOP with listening state cleared. This allows ptracer to monitor group stop state without running tracee - use INTERRUPT to put tracee into STOP trap, issue LISTEN and then wait(2) to wait for the next group stop event. When it happens, PTRACE_GETSIGINFO provides information to determine the current state. Test program follows. #define PTRACE_SEIZE 0x4206 #define PTRACE_INTERRUPT 0x4207 #define PTRACE_LISTEN 0x4208 #define PTRACE_SEIZE_DEVEL 0x80000000 static const struct timespec ts1s = { .tv_sec = 1 }; int main(int argc, char **argv) { pid_t tracee, tracer; int i; tracee = fork(); if (!tracee) while (1) pause(); tracer = fork(); if (!tracer) { siginfo_t si; ptrace(PTRACE_SEIZE, tracee, NULL, (void *)(unsigned long)PTRACE_SEIZE_DEVEL); ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL); repeat: waitid(P_PID, tracee, NULL, WSTOPPED); ptrace(PTRACE_GETSIGINFO, tracee, NULL, &si); if (!si.si_code) { printf("tracer: SIG %d\n", si.si_signo); ptrace(PTRACE_CONT, tracee, NULL, (void *)(unsigned long)si.si_signo); goto repeat; } printf("tracer: stopped=%d signo=%d\n", si.si_signo != SIGTRAP, si.si_signo); if (si.si_signo != SIGTRAP) ptrace(PTRACE_LISTEN, tracee, NULL, NULL); else ptrace(PTRACE_CONT, tracee, NULL, NULL); goto repeat; } for (i = 0; i < 3; i++) { nanosleep(&ts1s, NULL); printf("mother: SIGSTOP\n"); kill(tracee, SIGSTOP); nanosleep(&ts1s, NULL); printf("mother: SIGCONT\n"); kill(tracee, SIGCONT); } nanosleep(&ts1s, NULL); kill(tracer, SIGKILL); kill(tracee, SIGKILL); return 0; } This is identical to the program to test TRAP_NOTIFY except that tracee is PTRACE_LISTEN'd instead of PTRACE_CONT'd when group stopped. This allows ptracer to monitor when group stop ends without running tracee. # ./test-listen tracer: stopped=0 signo=5 mother: SIGSTOP tracer: SIG 19 tracer: stopped=1 signo=19 mother: SIGCONT tracer: stopped=0 signo=5 tracer: SIG 18 mother: SIGSTOP tracer: SIG 19 tracer: stopped=1 signo=19 mother: SIGCONT tracer: stopped=0 signo=5 tracer: SIG 18 mother: SIGSTOP tracer: SIG 19 tracer: stopped=1 signo=19 mother: SIGCONT tracer: stopped=0 signo=5 tracer: SIG 18 -v2: Moved JOBCTL_LISTENING check in wait_task_stopped() into task_stopped_code() as suggested by Oleg. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com>
-
由 Tejun Heo 提交于
Currently, there's no way to trap a running ptracee short of sending a signal which has various side effects. This patch implements PTRACE_INTERRUPT which traps ptracee without any signal or job control related side effect. The implementation is almost trivial. It uses the group stop trap - SIGTRAP | PTRACE_EVENT_STOP << 8. A new trap flag JOBCTL_TRAP_INTERRUPT is added, which is set on PTRACE_INTERRUPT and cleared when any trap happens. As INTERRUPT should be useable regardless of the current state of tracee, task_is_traced() test in ptrace_check_attach() is skipped for INTERRUPT. PTRACE_INTERRUPT is available iff tracee is attached with PTRACE_SEIZE. Test program follows. #define PTRACE_SEIZE 0x4206 #define PTRACE_INTERRUPT 0x4207 #define PTRACE_SEIZE_DEVEL 0x80000000 static const struct timespec ts100ms = { .tv_nsec = 100000000 }; static const struct timespec ts1s = { .tv_sec = 1 }; static const struct timespec ts3s = { .tv_sec = 3 }; int main(int argc, char **argv) { pid_t tracee; tracee = fork(); if (tracee == 0) { nanosleep(&ts100ms, NULL); while (1) { printf("tracee: alive pid=%d\n", getpid()); nanosleep(&ts1s, NULL); } } if (argc > 1) kill(tracee, SIGSTOP); nanosleep(&ts100ms, NULL); ptrace(PTRACE_SEIZE, tracee, NULL, (void *)(unsigned long)PTRACE_SEIZE_DEVEL); if (argc > 1) { waitid(P_PID, tracee, NULL, WSTOPPED); ptrace(PTRACE_CONT, tracee, NULL, NULL); } nanosleep(&ts3s, NULL); printf("tracer: INTERRUPT and DETACH\n"); ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL); waitid(P_PID, tracee, NULL, WSTOPPED); ptrace(PTRACE_DETACH, tracee, NULL, NULL); nanosleep(&ts3s, NULL); printf("tracer: exiting\n"); kill(tracee, SIGKILL); return 0; } When called without argument, tracee is seized from running state, interrupted and then detached back to running state. # ./test-interrupt tracee: alive pid=4546 tracee: alive pid=4546 tracee: alive pid=4546 tracer: INTERRUPT and DETACH tracee: alive pid=4546 tracee: alive pid=4546 tracee: alive pid=4546 tracer: exiting When called with argument, tracee is seized from stopped state, continued, interrupted and then detached back to stopped state. # ./test-interrupt 1 tracee: alive pid=4548 tracee: alive pid=4548 tracee: alive pid=4548 tracer: INTERRUPT and DETACH tracer: exiting Before PTRACE_INTERRUPT, once the tracee was running, there was no way to trap tracee and do PTRACE_DETACH without causing side effect. -v2: Updated to use task_set_jobctl_pending() so that it doesn't end up scheduling TRAP_STOP if child is dying which may make the child unkillable. Spotted by Oleg. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com>
-
由 Tejun Heo 提交于
PTRACE_ATTACH implicitly issues SIGSTOP on attach which has side effects on tracee signal and job control states. This patch implements a new ptrace request PTRACE_SEIZE which attaches a tracee without trapping it or affecting its signal and job control states. The usage is the same with PTRACE_ATTACH but it takes PTRACE_SEIZE_* flags in @data. Currently, the only defined flag is PTRACE_SEIZE_DEVEL which is a temporary flag to enable PTRACE_SEIZE. PTRACE_SEIZE will change ptrace behaviors outside of attach itself. The changes will be implemented gradually and the DEVEL flag is to prevent programs which expect full SEIZE behavior from using it before all the behavior modifications are complete while allowing unit testing. The flag will be removed once SEIZE behaviors are completely implemented. * PTRACE_SEIZE, unlike ATTACH, doesn't force tracee to trap. After attaching tracee continues to run unless a trap condition occurs. * PTRACE_SEIZE doesn't affect signal or group stop state. * If PTRACE_SEIZE'd, group stop uses PTRACE_EVENT_STOP trap which uses exit_code of (signr | PTRACE_EVENT_STOP << 8) where signr is one of the stopping signals if group stop is in effect or SIGTRAP otherwise, and returns usual trap siginfo on PTRACE_GETSIGINFO instead of NULL. Seizing sets PT_SEIZED in ->ptrace of the tracee. This flag will be used to determine whether new SEIZE behaviors should be enabled. Test program follows. #define PTRACE_SEIZE 0x4206 #define PTRACE_SEIZE_DEVEL 0x80000000 static const struct timespec ts100ms = { .tv_nsec = 100000000 }; static const struct timespec ts1s = { .tv_sec = 1 }; static const struct timespec ts3s = { .tv_sec = 3 }; int main(int argc, char **argv) { pid_t tracee; tracee = fork(); if (tracee == 0) { nanosleep(&ts100ms, NULL); while (1) { printf("tracee: alive\n"); nanosleep(&ts1s, NULL); } } if (argc > 1) kill(tracee, SIGSTOP); nanosleep(&ts100ms, NULL); ptrace(PTRACE_SEIZE, tracee, NULL, (void *)(unsigned long)PTRACE_SEIZE_DEVEL); if (argc > 1) { waitid(P_PID, tracee, NULL, WSTOPPED); ptrace(PTRACE_CONT, tracee, NULL, NULL); } nanosleep(&ts3s, NULL); printf("tracer: exiting\n"); return 0; } When the above program is called w/o argument, tracee is seized while running and remains running. When tracer exits, tracee continues to run and print out messages. # ./test-seize-simple tracee: alive tracee: alive tracee: alive tracer: exiting tracee: alive tracee: alive When called with an argument, tracee is seized from stopped state and continued, and returns to stopped state when tracer exits. # ./test-seize tracee: alive tracee: alive tracee: alive tracer: exiting # ps -el|grep test-seize 1 T 0 4720 1 0 80 0 - 941 signal ttyS0 00:00:00 test-seize -v2: SEIZE doesn't schedule TRAP_STOP and leaves tracee running as Jan suggested. -v3: PTRACE_EVENT_STOP traps now report group stop state by signr. If group stop is in effect the stop signal number is returned as part of exit_code; otherwise, SIGTRAP. This was suggested by Denys and Oleg. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Cc: Denys Vlasenko <vda.linux@googlemail.com> Cc: Oleg Nesterov <oleg@redhat.com>
-