1. 05 1月, 2012 2 次提交
    • O
      ptrace: ensure JOBCTL_STOP_SIGMASK is not zero after detach · 8a88951b
      Oleg Nesterov 提交于
      This is the temporary simple fix for 3.2, we need more changes in this
      area.
      
      1. do_signal_stop() assumes that the running untraced thread in the
         stopped thread group is not possible. This was our goal but it is
         not yet achieved: a stopped-but-resumed tracee can clone the running
         thread which can initiate another group-stop.
      
         Remove WARN_ON_ONCE(!current->ptrace).
      
      2. A new thread always starts with ->jobctl = 0. If it is auto-attached
         and this group is stopped, __ptrace_unlink() sets JOBCTL_STOP_PENDING
         but JOBCTL_STOP_SIGMASK part is zero, this triggers WANR_ON(!signr)
         in do_jobctl_trap() if another debugger attaches.
      
         Change __ptrace_unlink() to set the artificial SIGSTOP for report.
      
         Alternatively we could change ptrace_init_task() to copy signr from
         current, but this means we can copy it for no reason and hide the
         possible similar problems.
      Acked-by: NTejun Heo <tj@kernel.org>
      Cc: <stable@kernel.org>		[3.1]
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8a88951b
    • O
      ptrace: partially fix the do_wait(WEXITED) vs EXIT_DEAD->EXIT_ZOMBIE race · 50b8d257
      Oleg Nesterov 提交于
      Test-case:
      
      	int main(void)
      	{
      		int pid, status;
      
      		pid = fork();
      		if (!pid) {
      			for (;;) {
      				if (!fork())
      					return 0;
      				if (waitpid(-1, &status, 0) < 0) {
      					printf("ERR!! wait: %m\n");
      					return 0;
      				}
      			}
      		}
      
      		assert(ptrace(PTRACE_ATTACH, pid, 0,0) == 0);
      		assert(waitpid(-1, NULL, 0) == pid);
      
      		assert(ptrace(PTRACE_SETOPTIONS, pid, 0,
      					PTRACE_O_TRACEFORK) == 0);
      
      		do {
      			ptrace(PTRACE_CONT, pid, 0, 0);
      			pid = waitpid(-1, NULL, 0);
      		} while (pid > 0);
      
      		return 1;
      	}
      
      It fails because ->real_parent sees its child in EXIT_DEAD state
      while the tracer is going to change the state back to EXIT_ZOMBIE
      in wait_task_zombie().
      
      The offending commit is 823b018e which moved the EXIT_DEAD check,
      but in fact we should not blame it. The original code was not
      correct as well because it didn't take ptrace_reparented() into
      account and because we can't really trust ->ptrace.
      
      This patch adds the additional check to close this particular
      race but it doesn't solve the whole problem. We simply can't
      rely on ->ptrace in this case, it can be cleared if the tracer
      is multithreaded by the exiting ->parent.
      
      I think we should kill EXIT_DEAD altogether, we should always
      remove the soon-to-be-reaped child from ->children or at least
      we should never do the DEAD->ZOMBIE transition. But this is too
      complex for 3.2.
      Reported-and-tested-by: NDenys Vlasenko <vda.linux@googlemail.com>
      Tested-by: NLukasz Michalik <lmi@ift.uni.wroc.pl>
      Acked-by: NTejun Heo <tj@kernel.org>
      Cc: <stable@kernel.org>		[3.0+]
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      50b8d257
  2. 04 1月, 2012 1 次提交
  3. 01 1月, 2012 1 次提交
    • H
      futex: Fix uninterruptible loop due to gate_area · e6780f72
      Hugh Dickins 提交于
      It was found (by Sasha) that if you use a futex located in the gate
      area we get stuck in an uninterruptible infinite loop, much like the
      ZERO_PAGE issue.
      
      While looking at this problem, PeterZ realized you'll get into similar
      trouble when hitting any install_special_pages() mapping.  And are there
      still drivers setting up their own special mmaps without page->mapping,
      and without special VM or pte flags to make get_user_pages fail?
      
      In most cases, if page->mapping is NULL, we do not need to retry at all:
      Linus points out that even /proc/sys/vm/drop_caches poses no problem,
      because it ends up using remove_mapping(), which takes care not to
      interfere when the page reference count is raised.
      
      But there is still one case which does need a retry: if memory pressure
      called shmem_writepage in between get_user_pages_fast dropping page
      table lock and our acquiring page lock, then the page gets switched from
      filecache to swapcache (and ->mapping set to NULL) whatever the refcount.
      Fault it back in to get the page->mapping needed for key->shared.inode.
      Reported-by: NSasha Levin <levinsasha928@gmail.com>
      Signed-off-by: NHugh Dickins <hughd@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e6780f72
  4. 31 12月, 2011 1 次提交
  5. 21 12月, 2011 3 次提交
    • P
      lockdep/waitqueues: Add better annotation · f07fdec5
      Peter Zijlstra 提交于
       -> #2 (&tty->write_wait){-.-...}:
      
      is a lot more informative than:
      
       -> #2 (key#19){-.....}:
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/n/tip-8zpopbny51023rdb0qq67eye@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      f07fdec5
    • M
      binary_sysctl(): fix memory leak · 3d3c8f93
      Michel Lespinasse 提交于
      binary_sysctl() calls sysctl_getname() which allocates from names_cache
      slab usin __getname()
      
      The matching function to free the name is __putname(), and not putname()
      which should be used only to match getname() allocations.
      
      This is because when auditing is enabled, putname() calls audit_putname
      *instead* (not in addition) to __putname().  Then, if a syscall is in
      progress, audit_putname does not release the name - instead, it expects
      the name to get released when the syscall completes, but that will happen
      only if audit_getname() was called previously, i.e.  if the name was
      allocated with getname() rather than the naked __getname().  So,
      __getname() followed by putname() ends up leaking memory.
      Signed-off-by: NMichel Lespinasse <walken@google.com>
      Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3d3c8f93
    • D
      cpusets: stall when updating mems_allowed for mempolicy or disjoint nodemask · b246272e
      David Rientjes 提交于
      Kernels where MAX_NUMNODES > BITS_PER_LONG may temporarily see an empty
      nodemask in a tsk's mempolicy if its previous nodemask is remapped onto a
      new set of allowed cpuset nodes where the two nodemasks, as a result of
      the remap, are now disjoint.
      
      c0ff7453 ("cpuset,mm: fix no node to alloc memory when changing
      cpuset's mems") adds get_mems_allowed() to prevent the set of allowed
      nodes from changing for a thread.  This causes any update to a set of
      allowed nodes to stall until put_mems_allowed() is called.
      
      This stall is unncessary, however, if at least one node remains unchanged
      in the update to the set of allowed nodes.  This was addressed by
      89e8a244 ("cpusets: avoid looping when storing to mems_allowed if one
      node remains set"), but it's still possible that an empty nodemask may be
      read from a mempolicy because the old nodemask may be remapped to the new
      nodemask during rebind.  To prevent this, only avoid the stall if there is
      no mempolicy for the thread being changed.
      
      This is a temporary solution until all reads from mempolicy nodemasks can
      be guaranteed to not be empty without the get_mems_allowed()
      synchronization.
      
      Also moves the check for nodemask intersection inside task_lock() so that
      tsk->mems_allowed cannot change.  This ensures that nothing can set this
      tsk's mems_allowed out from under us and also protects tsk->mempolicy.
      Reported-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Paul Menage <paul@paulmenage.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b246272e
  6. 20 12月, 2011 1 次提交
    • M
      cgroups: fix a css_set not found bug in cgroup_attach_proc · e0197aae
      Mandeep Singh Baines 提交于
      There is a BUG when migrating a PF_EXITING proc. Since css_set_prefetch()
      is not called for the PF_EXITING case, find_existing_css_set() will return
      NULL inside cgroup_task_migrate() causing a BUG.
      
      This bug is easy to reproduce. Create a zombie and echo its pid to
      cgroup.procs.
      
      $ cat zombie.c
      \#include <unistd.h>
      
      int main()
      {
        if (fork())
            pause();
        return 0;
      }
      $
      
      We are hitting this bug pretty regularly on ChromeOS.
      
      This bug is already fixed by Tejun Heo's cgroup patchset which is
      targetted for the next merge window:
      
      https://lkml.org/lkml/2011/11/1/356
      
      I've create a smaller patch here which just fixes this bug so that a
      fix can be merged into the current release and stable.
      Signed-off-by: NMandeep Singh Baines <msb@chromium.org>
      Downstream-Bug-Report: http://crosbug.com/23953Reviewed-by: NLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: containers@lists.linux-foundation.org
      Cc: cgroups@vger.kernel.org
      Cc: stable@kernel.org
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Paul Menage <paul@paulmenage.org>
      Cc: Olof Johansson <olofj@chromium.org>
      e0197aae
  7. 19 12月, 2011 1 次提交
  8. 16 12月, 2011 1 次提交
  9. 14 12月, 2011 1 次提交
  10. 13 12月, 2011 1 次提交
  11. 12 12月, 2011 27 次提交