提交 · 0666fb51b1483f27506e212cc7f7b2645b5c7acc · gsplhtlxg / clone-Linux

26 5月, 2011 1 次提交

ptrace: ptrace_resume() shouldn't wake up !TASK_TRACED thread · 0666fb51

由 Oleg Nesterov 提交于 5月 25, 2011

It is not clear why ptrace_resume() does wake_up_process(). Unless the
caller is PTRACE_KILL the tracee should be TASK_TRACED so we can use
wake_up_state(__TASK_TRACED). If sys_ptrace() races with SIGKILL we do
not need the extra and potentionally spurious wakeup.

If the caller is PTRACE_KILL, wake_up_process() is even more wrong.
The tracee can sleep in any state in any place, and if we have a buggy
code which doesn't handle a spurious wakeup correctly PTRACE_KILL can
be used to exploit it. For example:

	int main(void)
	{
		int child, status;

		child = fork();
		if (!child) {
			int ret;

			assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);

			ret = pause();
			printf("pause: %d %m\n", ret);

			return 0x23;
		}

		sleep(1);
		assert(ptrace(PTRACE_KILL, child, 0,0) == 0);

		assert(child == wait(&status));
		printf("wait: %x\n", status);

		return 0;
	}

prints "pause: -1 Unknown error 514", -ERESTARTNOHAND leaks to the
userland. In this case sys_pause() is buggy as well and should be
fixed.

I do not know what was the original rationality behind PTRACE_KILL.
The man page is simply wrong and afaics it was always wrong. Imho
it should be deprecated, or may be it should do send_sig(SIGKILL)
as Denys suggests, but in any case I do not think that the current
behaviour was intentional.

Note: there is another problem, ptrace_resume() changes ->exit_code
and this can race with SIGKILL too. Eventually we should change ptrace
to not use ->exit_code.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>

0666fb51

25 4月, 2011 1 次提交

ptrace: Prepare to fix racy accesses on task breakpoints · bf26c018

由 Frederic Weisbecker 提交于 4月 07, 2011

When a task is traced and is in a stopped state, the tracer
may execute a ptrace request to examine the tracee state and
get its task struct. Right after, the tracee can be killed
and thus its breakpoints released.
This can happen concurrently when the tracer is in the middle
of reading or modifying these breakpoints, leading to dereferencing
a freed pointer.

Hence, to prepare the fix, create a generic breakpoint reference
holding API. When a reference on the breakpoints of a task is
held, the breakpoints won't be released until the last reference
is dropped. After that, no more ptrace request on the task's
breakpoints can be serviced for the tracer.
Reported-by: NOleg Nesterov <oleg@redhat.com>
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: v2.6.33.. <stable@kernel.org>
Link: http://lkml.kernel.org/r/1302284067-7860-2-git-send-email-fweisbec@gmail.com

bf26c018

04 4月, 2011 1 次提交

ptrace: ptrace_check_attach() should not do s/STOPPED/TRACED/ · 321fb561

由 Oleg Nesterov 提交于 4月 01, 2011

After "ptrace: Clean transitions between TASK_STOPPED and TRACED"
d79fdd6d, ptrace_check_attach()
should never see a TASK_STOPPED tracee and s/STOPPED/TRACED/ is
no longer legal. Add the warning.

Note: ptrace_check_attach() can be greatly simplified, in particular
it doesn't need tasklist. But I'd prefer another patch for that.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

321fb561

24 3月, 2011 1 次提交

userns: allow ptrace from non-init user namespaces · 8409cca7

由 Serge E. Hallyn 提交于 3月 23, 2011

ptrace is allowed to tasks in the same user namespace according to the
usual rules (i.e.  the same rules as for two tasks in the init user
namespace).  ptrace is also allowed to a user namespace to which the
current task the has CAP_SYS_PTRACE capability.

Changelog:
	Dec 31: Address feedback by Eric:
		. Correct ptrace uid check
		. Rename may_ptrace_ns to ptrace_capable
		. Also fix the cap_ptrace checks.
	Jan  1: Use const cred struct
	Jan 11: use task_ns_capable() in place of ptrace_capable().
	Feb 23: same_or_ancestore_user_ns() was not an appropriate
		check to constrain cap_issubset.  Rather, cap_issubset()
		only is meaningful when both capsets are in the same
		user_ns.
Signed-off-by: NSerge E. Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: NDaniel Lezcano <daniel.lezcano@free.fr>
Acked-by: NDavid Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8409cca7

23 3月, 2011 4 次提交

ptrace: Always put ptracee into appropriate execution state · 0e9f0a4a

由 Tejun Heo 提交于 3月 23, 2011

Currently, __ptrace_unlink() wakes up the tracee iff it's in
TASK_TRACED.  For unlinking from PTRACE_DETACH, this is correct as the
tracee is guaranteed to be in TASK_TRACED or dead; however, unlinking
also happens when the ptracer exits and in this case the ptracee can
be in any state and ptrace might be left running even if the group it
belongs to is stopped.

This patch updates __ptrace_unlink() such that GROUP_STOP_PENDING is
reinstated regardless of the ptracee's current state as long as it's
alive and makes sure that signal_wake_up() is called if execution
state transition is necessary.

Test case follows.

  #include <unistd.h>
  #include <time.h>
  #include <sys/types.h>
  #include <sys/ptrace.h>
  #include <sys/wait.h>

  static const struct timespec ts1s = { .tv_sec = 1 };

  int main(void)
  {
	  pid_t tracee;
	  siginfo_t si;

	  tracee = fork();
	  if (tracee == 0) {
		  while (1) {
			  nanosleep(&ts1s, NULL);
			  write(1, ".", 1);
		  }
	  }

	  ptrace(PTRACE_ATTACH, tracee, NULL, NULL);
	  waitid(P_PID, tracee, &si, WSTOPPED);
	  ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status);
	  waitid(P_PID, tracee, &si, WSTOPPED);
	  ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status);
	  write(1, "exiting", 7);
	  return 0;
  }

Before the patch, after the parent process exits, the child is left
running and prints out "." every second.

  exiting..... (continues)

After the patch, the group stop initiated by the implied SIGSTOP from
PTRACE_ATTACH is re-established when the parent exits.

  exiting
Signed-off-by: NTejun Heo <tj@kernel.org>
Reported-by: NOleg Nesterov <oleg@redhat.com>
Acked-by: NOleg Nesterov <oleg@redhat.com>

0e9f0a4a

ptrace: Collapse ptrace_untrace() into __ptrace_unlink() · e3bd058f

由 Tejun Heo 提交于 3月 23, 2011

Remove the extra task_is_traced() check in __ptrace_unlink() and
collapse ptrace_untrace() into __ptrace_unlink().  This is to prepare
for further changes.

While at it, drop the comment on top of ptrace_untrace() and convert
__ptrace_unlink() comment to docbook format.  Detailed comment will be
added by the next patch.

This patch doesn't cause any visible behavior changes.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NOleg Nesterov <oleg@redhat.com>

e3bd058f

ptrace: Clean transitions between TASK_STOPPED and TRACED · d79fdd6d

由 Tejun Heo 提交于 3月 23, 2011

Currently, if the task is STOPPED on ptrace attach, it's left alone
and the state is silently changed to TRACED on the next ptrace call.
The behavior breaks the assumption that arch_ptrace_stop() is called
before any task is poked by ptrace and is ugly in that a task
manipulates the state of another task directly.

With GROUP_STOP_PENDING, the transitions between TASK_STOPPED and
TRACED can be made clean.  The tracer can use the flag to tell the
tracee to retry stop on attach and detach.  On retry, the tracee will
enter the desired state in the correct way.  The lower 16bits of
task->group_stop is used to remember the signal number which caused
the last group stop.  This is used while retrying for ptrace attach as
the original group_exit_code could have been consumed with wait(2) by
then.

As the real parent may wait(2) and consume the group_exit_code
anytime, the group_exit_code needs to be saved separately so that it
can be used when switching from regular sleep to ptrace_stop().  This
is recorded in the lower 16bits of task->group_stop.

If a task is already stopped and there's no intervening SIGCONT, a
ptrace request immediately following a successful PTRACE_ATTACH should
always succeed even if the tracer doesn't wait(2) for attach
completion; however, with this change, the tracee might still be
TASK_RUNNING trying to enter TASK_TRACED which would cause the
following request to fail with -ESRCH.

This intermediate state is hidden from the ptracer by setting
GROUP_STOP_TRAPPING on attach and making ptrace_check_attach() wait
for it to clear on its signal->wait_chldexit.  Completing the
transition or getting killed clears TRAPPING and wakes up the tracer.

Note that the STOPPED -> RUNNING -> TRACED transition is still visible
to other threads which are in the same group as the ptracer and the
reverse transition is visible to all.  Please read the comments for
details.

Oleg:

* Spotted a race condition where a task may retry group stop without
  proper bookkeeping.  Fixed by redoing bookkeeping on retry.

* Spotted that the transition is visible to userland in several
  different ways.  Most are fixed with GROUP_STOP_TRAPPING.  Unhandled
  corner case is documented.

* Pointed out not setting GROUP_STOP_SIGMASK on an already stopped
  task would result in more consistent behavior.

* Pointed out that calling ptrace_stop() from do_signal_stop() in
  TASK_STOPPED can race with group stop start logic and then confuse
  the TRAPPING wait in ptrace_check_attach().  ptrace_stop() is now
  called with TASK_RUNNING.

* Suggested using signal->wait_chldexit instead of bit wait.

* Spotted a race condition between TRACED transition and clearing of
  TRAPPING.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NOleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>

d79fdd6d

ptrace: Remove the extra wake_up_state() from ptrace_detach() · 9f2bf651

由 Tejun Heo 提交于 3月 23, 2011

This wake_up_state() has a turbulent history.  This is a remnant from
ancient ptrace implementation and patently wrong.  Commit 95a3540d
(ptrace_detach: the wrong wakeup breaks the ERESTARTxxx logic) removed
it but the change was reverted later by commit edaba2c5 (ptrace:
revert "ptrace_detach: the wrong wakeup breaks the ERESTARTxxx logic")
citing compatibility breakage and general brokeness of the whole group
stop / ptrace interaction.  Then, recently, it got converted from
wake_up_process() to wake_up_state() to make it less dangerous.

Digging through the mailing archives, the compatibility breakage
doesn't seem to be critical in the sense that the behavior isn't well
defined or reliable to begin with and it seems to have been agreed to
remove the wakeup with proper cleanup of the whole thing.

Now that the group stop and its interaction with ptrace are being
cleaned up, it's high time to finally kill this silliness.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NOleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>

9f2bf651

05 3月, 2011 1 次提交

Mark ptrace_{traceme,attach,detach} static · e3e89cc5

由 Linus Torvalds 提交于 3月 04, 2011

They are only used inside kernel/ptrace.c, and have been for a long
time. We don't want to go back to the bad-old-days when architectures
did things on their own, so make them static and private.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e3e89cc5

12 2月, 2011 1 次提交

ptrace: use safer wake up on ptrace_detach() · 01e05e9a

由 Tejun Heo 提交于 2月 10, 2011

The wake_up_process() call in ptrace_detach() is spurious and not
interlocked with the tracee state.  IOW, the tracee could be running or
sleeping in any place in the kernel by the time wake_up_process() is
called.  This can lead to the tracee waking up unexpectedly which can be
dangerous.

The wake_up is spurious and should be removed but for now reduce its
toxicity by only waking up if the tracee is in TRACED or STOPPED state.

This bug can possibly be used as an attack vector.  I don't think it
will take too much effort to come up with an attack which triggers oops
somewhere.  Most sleeps are wrapped in condition test loops and should
be safe but we have quite a number of places where sleep and wakeup
conditions are expected to be interlocked.  Although the window of
opportunity is tiny, ptrace can be used by non-privileged users and with
some loading the window can definitely be extended and exploited.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NRoland McGrath <roland@redhat.com>
Acked-by: NOleg Nesterov <oleg@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

01e05e9a

28 10月, 2010 4 次提交

signals: move cred_guard_mutex from task_struct to signal_struct · 9b1bf12d

由 KOSAKI Motohiro 提交于 10月 27, 2010

Oleg Nesterov pointed out we have to prevent multiple-threads-inside-exec
itself and we can reuse ->cred_guard_mutex for it.  Yes, concurrent
execve() has no worth.

Let's move ->cred_guard_mutex from task_struct to signal_struct.  It
naturally prevent multiple-threads-inside-exec.
Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: NOleg Nesterov <oleg@redhat.com>
Acked-by: NRoland McGrath <roland@redhat.com>
Acked-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9b1bf12d

ptrace: cleanup ptrace_request() · 9fed81dc

由 Namhyung Kim 提交于 10月 27, 2010

Use new 'datavp' and 'datalp' variables to remove unnecesary castings.
Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
Acked-by: NRoland McGrath <roland@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9fed81dc

ptrace: change signature of sys_ptrace() and friends · 4abf9869

由 Namhyung Kim 提交于 10月 27, 2010

Since userspace API of ptrace syscall defines @addr and @data as void
pointers, it would be more appropriate to define them as unsigned long in
kernel.  Therefore related functions are changed also.

'unsigned long' is typically used in other places in kernel as an opaque
data type and that using this helps cleaning up a lot of warnings from
sparse.
Suggested-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
Acked-by: NArnd Bergmann <arnd@arndb.de>
Acked-by: NRoland McGrath <roland@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4abf9869

ptrace: annotate lock context change on exit_ptrace() · c4b5ed25

由 Namhyung Kim 提交于 10月 27, 2010

exit_ptrace() releases and regrabs tasklist_lock but was missing proper
annotation.  Add it.
Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
Acked-by: NRoland McGrath <roland@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Oleg Nesterov <oleg@redhat.com
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c4b5ed25

11 8月, 2010 1 次提交

ptrace: optimize exit_ptrace() for the likely case · c7e49c14

由 Oleg Nesterov 提交于 8月 10, 2010

exit_ptrace() takes tasklist_lock unconditionally.  We need this lock to
avoid the race with ptrace_traceme(), it acts as a barrier.

Change its caller, forget_original_parent(), to call exit_ptrace() under
tasklist_lock.  Change exit_ptrace() to drop and reacquire this lock if
needed.

This allows us to add the fastpath list_empty(ptraced) check.  In the
likely no-tracees case exit_ptrace() just returns and we avoid the lock()
+ unlock() sequence.

"Zhang, Yanmin" <yanmin_zhang@linux.intel.com> suggested to add this
check, and he reports that this change adds about 11% improvement in some
tests.
Suggested-and-tested-by: N"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Acked-by: NRoland McGrath <roland@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c7e49c14

28 5月, 2010 2 次提交

ptrace: PTRACE_GETFDPIC: fix the unsafe usage of child->mm · e0129ef9

由 Oleg Nesterov 提交于 5月 26, 2010

Now that Mike Frysinger unified the FDPIC ptrace code, we can fix the
unsafe usage of child->mm in ptrace_request(PTRACE_GETFDPIC).

We have the reference to task_struct, and ptrace_check_attach() verified
the tracee is stopped.  But nothing can protect from SIGKILL after that,
we must not assume child->mm != NULL.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Acked-by: NMike Frysinger <vapier.adi@gmail.com>
Acked-by: NDavid Howells <dhowells@redhat.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Greg Ungerer <gerg@snapgear.com>
Acked-by: NRoland McGrath <roland@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e0129ef9

ptrace: unify FDPIC implementations · 9c1a1259

由 Mike Frysinger 提交于 5月 26, 2010

The Blackfin/FRV/SuperH guys all have the same exact FDPIC ptrace code in
their arch handlers (since they were probably copied & pasted).  Since
these ptrace interfaces are an arch independent aspect of the FDPIC code,
unify them in the common ptrace code so new FDPIC ports don't need to copy
and paste this fundamental stuff yet again.
Signed-off-by: NMike Frysinger <vapier@gentoo.org>
Acked-by: NRoland McGrath <roland@redhat.com>
Acked-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NPaul Mundt <lethal@linux-sh.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9c1a1259

27 4月, 2010 1 次提交

ptrace: Cleanup useless header · b8bc1389

由 Alessio Igor Bogani 提交于 4月 25, 2010

BKL isn't present anymore into this file thus we can safely remove
smp_lock.h inclusion.
Signed-off-by: NAlessio Igor Bogani <abogani@texware.it>
Cc: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: James Morris <jmorris@namei.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>

b8bc1389

10 4月, 2010 1 次提交

ptrace: kill BKL in ptrace syscall · 5534ecb2

由 Arnd Bergmann 提交于 2月 27, 2010

The comment suggests that this usage is stale. There is no bkl in the
exec path so if there is a race lurking there, the bkl in ptrace is
not going to help in this regard.

Overview of the possibility of "accidental" races this bkl might
protect:

- ptrace_traceme() is protected against task removal and concurrent
read/write on current->ptrace as it locks write tasklist_lock.

- arch_ptrace_attach() is serialized by ptrace_traceme() against
concurrent PTRACE_TRACEME or PTRACE_ATTACH

- ptrace_attach() is protected the same way ptrace_traceme() and
in turn serializes arch_ptrace_attach()

- ptrace_check_attach() does its own well described serializing too.

There is no obvious race here.
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Acked-by: NOleg Nesterov <oleg@redhat.com>
Acked-by: NRoland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Roland McGrath <roland@redhat.com>

5534ecb2

26 3月, 2010 1 次提交

x86, perf, bts, mm: Delete the never used BTS-ptrace code · faa4602e

由 Peter Zijlstra 提交于 3月 25, 2010

Support for the PMU's BTS features has been upstreamed in
v2.6.32, but we still have the old and disabled ptrace-BTS,
as Linus noticed it not so long ago.

It's buggy: TIF_DEBUGCTLMSR is trampling all over that MSR without
regard for other uses (perf) and doesn't provide the flexibility
needed for perf either.

Its users are ptrace-block-step and ptrace-bts, since ptrace-bts
was never used and ptrace-block-step can be implemented using a
much simpler approach.

So axe all 3000 lines of it. That includes the *locked_memory*()
APIs in mm/mlock.c as well.
Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Markus Metzger <markus.t.metzger@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <20100325135413.938004390@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

faa4602e

24 2月, 2010 1 次提交

ptrace: Fix ptrace_regset() comments and diagnose errors specifically · c6a0dd7e

由 Suresh Siddha 提交于 2月 22, 2010

Return -EINVAL for the bad size and for unrecognized NT_* type in
ptrace_regset() instead of -EIO.

Also update the comments for this ptrace interface with more clarifications.
Requested-by: NRoland McGrath <roland@redhat.com>
Requested-by: NOleg Nesterov <oleg@redhat.com>
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <20100222225240.397523600@sbs-t61.sc.intel.com>
Acked-by: NRoland McGrath <roland@redhat.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

c6a0dd7e

12 2月, 2010 1 次提交

ptrace: Add support for generic PTRACE_GETREGSET/PTRACE_SETREGSET · 2225a122

由 Suresh Siddha 提交于 2月 11, 2010

Generic support for PTRACE_GETREGSET/PTRACE_SETREGSET commands which
export the regsets supported by each architecture using the correponding
NT_* types. These NT_* types are already part of the userland ABI, used
in representing the architecture specific register sets as different NOTES
in an ELF core file.

'addr' parameter for the ptrace system call encode the REGSET type (using
the corresppnding NT_* type) and the 'data' parameter points to the
struct iovec having the user buffer and the length of that buffer.

	struct iovec iov = { buf, len};
	ret = ptrace(PTRACE_GETREGSET/PTRACE_SETREGSET, pid, NT_XXX_TYPE, &iov);

On successful completion, iov.len will be updated by the kernel specifying
how much the kernel has written/read to/from the user's iov.buf.

x86 extended state registers are primarily exported using this interface.
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <20100211195614.886724710@sbs-t61.sc.intel.com>
Acked-by: NHongjiu Lu <hjl.tools@gmail.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

2225a122

24 9月, 2009 1 次提交

ptrace: __ptrace_detach: do __wake_up_parent() if we reap the tracee · a7f0765e

由 Oleg Nesterov 提交于 9月 23, 2009

The bug is old, it wasn't cause by recent changes.

Test case:

	static void *tfunc(void *arg)
	{
		int pid = (long)arg;

		assert(ptrace(PTRACE_ATTACH, pid, NULL, NULL) == 0);
		kill(pid, SIGKILL);

		sleep(1);
		return NULL;
	}

	int main(void)
	{
		pthread_t th;
		long pid = fork();

		if (!pid)
			pause();

		signal(SIGCHLD, SIG_IGN);
		assert(pthread_create(&th, NULL, tfunc, (void*)pid) == 0);

		int r = waitpid(-1, NULL, __WNOTHREAD);
		printf("waitpid: %d %m\n", r);

		return 0;
	}

Before the patch this program hangs, after this patch waitpid() correctly
fails with errno == -ECHILD.

The problem is, __ptrace_detach() reaps the EXIT_ZOMBIE tracee if its
->real_parent is our sub-thread and we ignore SIGCHLD.  But in this case
we should wake up other threads which can sleep in do_wait().
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Vitaly Mayatskikh <vmayatsk@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a7f0765e

07 7月, 2009 1 次提交

cred_guard_mutex: do not return -EINTR to user-space · 793285fc

由 Oleg Nesterov 提交于 7月 05, 2009

do_execve() and ptrace_attach() return -EINTR if
mutex_lock_interruptible(->cred_guard_mutex) fails.

This is not right, change the code to return ERESTARTNOINTR.

Perhaps we should also change proc_pid_attr_write().
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Acked-by: NRoland McGrath <roland@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

793285fc

24 6月, 2009 1 次提交

security: rename ptrace_may_access => ptrace_access_check · 9e48858f

由 Ingo Molnar 提交于 5月 07, 2009

The ->ptrace_may_access() methods are named confusingly - the real
ptrace_may_access() returns a bool, while these security checks have
a retval convention.

Rename it to ptrace_access_check, to reduce the confusion factor.

[ Impact: cleanup, no code changed ]
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NJames Morris <jmorris@namei.org>

9e48858f

19 6月, 2009 5 次提交

ptrace: don't take tasklist to get/set ->last_siginfo · e4961254

由 Oleg Nesterov 提交于 6月 17, 2009

Change ptrace_getsiginfo/ptrace_setsiginfo to use lock_task_sighand()
without tasklist_lock.  Perhaps it makes sense to make a single helper
with "bool rw" argument.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Acked-by: NRoland McGrath <roland@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e4961254

ptrace_get_task_struct: s/tasklist/rcu/, make it static · 8053bdd5

由 Oleg Nesterov 提交于 6月 17, 2009

- Use rcu_read_lock() instead of tasklist_lock to find/get the task
  in ptrace_get_task_struct().

- Make it static, it has no callers outside of ptrace.c.

- The comment doesn't match the reality, this helper does not do
  any checks. Beacuse it is really trivial and static I removed the
  whole comment.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Acked-by: NRoland McGrath <roland@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8053bdd5

ptrace: do not use task_lock() for attach · 4b105cbb

由 Oleg Nesterov 提交于 6月 17, 2009

Remove the "Nasty, nasty" lock dance in ptrace_attach()/ptrace_traceme() -
from now task_lock() has nothing to do with ptrace at all.

With the recent changes nobody uses task_lock() to serialize with ptrace,
but in fact it was never needed and it was never used consistently.

However ptrace_attach() calls __ptrace_may_access() and needs task_lock()
to pin task->mm for get_dumpable().  But we can call __ptrace_may_access()
before we take tasklist_lock, ->cred_exec_mutex protects us against
do_execve() path which can change creds and MMF_DUMP* flags.

(ugly, but we can't use ptrace_may_access() because it hides the error
code, so we have to take task_lock() and use __ptrace_may_access()).

NOTE: this change assumes that LSM hooks, security_ptrace_may_access() and
security_ptrace_traceme(), can be called without task_lock() held.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Acked-by: NRoland McGrath <roland@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4b105cbb

ptrace: cleanup check/set of PT_PTRACED during attach · f2f0b00a

由 Oleg Nesterov 提交于 6月 17, 2009

ptrace_attach() and ptrace_traceme() are the last functions which look as
if the untraced task can have task->ptrace != 0, this must not be
possible.  Change the code to just check ->ptrace != 0 and s/|=/=/ to set
PT_PTRACED.

Also, a couple of trivial whitespace cleanups in ptrace_attach().

And move ptrace_traceme() up near ptrace_attach() to keep them close to
each other.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Acked-by: NRoland McGrath <roland@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f2f0b00a

ptrace: ptrace_attach: check PF_KTHREAD + exit_state instead of ->mm · b79b7ba9

由 Oleg Nesterov 提交于 6月 17, 2009

- Add PF_KTHREAD check to prevent attaching to the kernel thread
  with a borrowed ->mm.

  With or without this change we can race with daemonize() which
  can set PF_KTHREAD or clear ->mm after ptrace_attach() does the
  check, but this doesn't matter because reparent_to_kthreadd()
  does ptrace_unlink().

- Kill "!task->mm" check. We don't really care about ->mm != NULL,
  and the task can call exit_mm() right after we drop task_lock().
  What we need is to make sure we can't attach after exit_notify(),
  check task->exit_state != 0 instead.

Also, move the "already traced" check down for cosmetic reasons.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Acked-by: NRoland McGrath <roland@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b79b7ba9

05 6月, 2009 1 次提交

ptrace: revert "ptrace_detach: the wrong wakeup breaks the ERESTARTxxx logic" · edaba2c5

由 Oleg Nesterov 提交于 6月 04, 2009

Commit 95a3540d ("ptrace_detach: the wrong
wakeup breaks the ERESTARTxxx logic") removed the "extra"
wake_up_process() from ptrace_detach(), but as Jan pointed out this breaks
the compatibility.

I believe the changelog is right and this wake_up() is wrong in many
ways, but GDB assumes that ptrace(PTRACE_DETACH, child, 0, 0) always
wakes up the tracee.

Despite the fact this breaks SIGNAL_STOP_STOPPED/group_stop_count logic,
and despite the fact this wake_up_process() can break another
assumption: PTRACE_DETACH with SIGSTOP should leave the tracee in
TASK_STOPPED case.  Because the untraced child can dequeue SIGSTOP and
call do_signal_stop() before ptrace_detach() calls wake_up_process().

Revert this change for now.  We need some fixes even if we we want to keep
the current behaviour, but these fixes are not for 2.6.30.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Acked-by: NRoland McGrath <roland@redhat.com>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

edaba2c5

11 5月, 2009 1 次提交

CRED: Rename cred_exec_mutex to reflect that it's a guard against ptrace · 5e751e99

由 David Howells 提交于 5月 08, 2009

Rename cred_exec_mutex to reflect that it's a guard against foreign
intervention on a process's credential state, such as is made by ptrace(). The
attachment of a debugger to a process affects execve()'s calculation of the new
credential state - _and_ also setprocattr()'s calculation of that state.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NJames Morris <jmorris@namei.org>

5e751e99

27 4月, 2009 1 次提交

ptrace: ptrace_attach: fix the usage of ->cred_exec_mutex · cad81bc2

由 Oleg Nesterov 提交于 4月 27, 2009

ptrace_attach() needs task->cred_exec_mutex, not current->cred_exec_mutex.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Acked-by: NRoland McGrath <roland@redhat.com>
Acked-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NJames Morris <jmorris@namei.org>

cad81bc2

14 4月, 2009 1 次提交

ptrace: fix exit_ptrace() vs ptrace_traceme() race · f1671f6d

由 Oleg Nesterov 提交于 4月 13, 2009

Pointed out by Roland.  The bug was recently introduced by me in
"forget_original_parent: split out the un-ptrace part", commit
39c626ae.

Since that patch we have a window after exit_ptrace() drops tasklist and
before forget_original_parent() takes it again.  In this window the child
can do ptrace(PTRACE_TRACEME) and nobody can untrace this child after
that.

Change ptrace_traceme() to not attach to the exiting ->real_parent.  We
don't report the error in this case, we pretend we attach right before
->real_parent calls exit_ptrace() which should untrace us anyway.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Acked-by: NRoland McGrath <roland@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f1671f6d

09 4月, 2009 1 次提交

ptrace: some checkpatch fixes · 3a709703

由 Roland McGrath 提交于 4月 07, 2009

This fixes all the checkpatch --file complaints about kernel/ptrace.c
and also removes an unused #include.  I've verified that there are no
changes to the compiled code on x86_64.
Signed-off-by: NRoland McGrath <roland@redhat.com>
[ Removed the parts that just split a line  - Linus ]
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3a709703

07 4月, 2009 1 次提交

x86, ptrace: add bts context unconditionally · 0f481406

由 Markus Metzger 提交于 4月 03, 2009

Add the ptrace bts context field to task_struct unconditionally.

Initialize the field directly in copy_process().
Remove all the unneeded functionality used to initialize that field.
Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com>
Cc: roland@redhat.com
Cc: eranian@googlemail.com
Cc: oleg@redhat.com
Cc: juan.villacis@intel.com
Cc: ak@linux.jf.intel.com
LKML-Reference: <20090403144603.292754000@intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

0f481406

03 4月, 2009 4 次提交

ptrace_untrace: fix the SIGNAL_STOP_STOPPED check · 1ee11844

由 Oleg Nesterov 提交于 4月 02, 2009

This bug is ancient too. ptrace_untrace() must not resume the task
if the group stop in progress, we should set TASK_STOPPED instead.

Unfortunately, we still have problems here:

	- if the process/thread was traced, SIGNAL_STOP_STOPPED
	  does not necessary means this thread group is stopped.

	- ptrace breaks the bookkeeping of ->group_stop_count.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1ee11844

ptrace_detach: the wrong wakeup breaks the ERESTARTxxx logic · 95a3540d

由 Oleg Nesterov 提交于 4月 02, 2009

Another ancient bug. Consider this trivial test-case,

	int main(void)
	{
		int pid = fork();

		if (pid) {
			ptrace(PTRACE_ATTACH, pid, NULL, NULL);
			wait(NULL);
			ptrace(PTRACE_DETACH, pid, NULL, NULL);
		} else {
			pause();
			printf("WE HAVE A KERNEL BUG!!!\n");
		}

		return 0;
	}

the child must not "escape" for sys_pause(), but it can and this was seen
in practice.

This is because ptrace_detach does:

	if (!child->exit_state)
		wake_up_process(child);

this wakeup can happen after this child has already restarted sys_pause(),
because it gets another wakeup from ptrace_untrace().

With or without this patch, perhaps sys_pause() needs a fix.  But this
wakeup also breaks the SIGNAL_STOP_STOPPED logic in ptrace_untrace().

Remove this wakeup.  The caller saw this task in TASK_TRACED state, and
unless it was SIGKILL'ed in between __ptrace_unlink()->ptrace_untrace()
should handle this case correctly.  If it was SIGKILL'ed, we don't need to
wakup the dying tracee too.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Acked-by: NRoland McGrath <roland@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

95a3540d

forget_original_parent: split out the un-ptrace part · 39c626ae

由 Oleg Nesterov 提交于 4月 02, 2009

By discussion with Roland.

- Rename ptrace_exit() to exit_ptrace(), and change it to do all the
  necessary work with ->ptraced list by its own.

- Move this code from exit.c to ptrace.c

- Update the comment in ptrace_detach() to explain the rechecking of
  the child->ptrace.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Metzger, Markus T" <markus.t.metzger@intel.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

39c626ae

ptrace: fix possible zombie leak on PTRACE_DETACH · 4576145c

由 Oleg Nesterov 提交于 4月 02, 2009

When ptrace_detach() takes tasklist, the tracee can be SIGKILL'ed.  If it
has already passed exit_notify() we can leak a zombie, because a) ptracing
disables the auto-reaping logic, and b) ->real_parent was not notified
about the child's death.

ptrace_detach() should follow the ptrace_exit's logic, change the code
accordingly.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Tested-by: NDenys Vlasenko <dvlasenk@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4576145c