1. 08 Apr, 2014 3 commits
  2. 20 Mar, 2014 1 commit
  3. 11 Mar, 2014 1 commit
  4. 24 Jan, 2014 4 commits
    • proc: fix ->f_pos overflows in first_tid() · 9f6e963f
      Oleg Nesterov committed
      1. proc_task_readdir()->first_tid() path truncates f_pos to int, this
         is wrong even on 64bit.
      
         We could check that f_pos < PID_MAX or even INT_MAX in
         proc_task_readdir(), but this patch simply checks the potential
         overflow in first_tid(), this check is nop on 64bit.  We do not care if
         it was negative and the new unsigned value is huge, all we need to
         ensure is that we never wrongly return !NULL.
      
      2. Remove the 2nd "nr != 0" check before get_nr_threads(),
         nr_threads == 0 is not distinguishable from !pid_task() above.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Sameer Nanda <snanda@chromium.org>
      Cc: Sergey Dyasly <dserrg@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
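The overflow check described in point 1 above can be sketched in user space. This is a minimal illustration, not the kernel code: `sketch_loff_t` and `sketch_pos_to_nr()` are hypothetical names, and a `uint32_t` stands in for the `unsigned long` of a 32-bit kernel. The offset is stored in the narrower type and then compared back against the original; any mismatch means the value was truncated and no task should be returned.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stand-in for the kernel's loff_t (signed 64-bit offset). */
typedef int64_t sketch_loff_t;

/*
 * Hypothetical helper, not a kernel function: store f_pos in a 32-bit
 * unsigned counter, as a 32-bit unsigned long would hold it, and report
 * whether the value survived the conversion unchanged.
 * Returns 1 and fills *nr_out on success, 0 if f_pos does not fit.
 */
int sketch_pos_to_nr(sketch_loff_t f_pos, uint32_t *nr_out)
{
	uint32_t nr = (uint32_t)f_pos;

	/* Widen back and compare: a mismatch means truncation happened. */
	if ((sketch_loff_t)nr != f_pos)
		return 0;

	*nr_out = nr;
	return 1;
}
```

With a 64-bit counter the comparison can never fail, which is why the commit message calls the check a nop on 64bit. In this 32-bit sketch a negative offset is rejected too, since widening the huge unsigned value back does not reproduce it.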
    • proc: don't (ab)use ->group_leader in proc_task_readdir() paths · d855a4b7
      Oleg Nesterov committed
      proc_task_readdir() does not really need "leader", first_tid() has to
      revalidate it anyway.  Just pass proc_pid(inode) to first_tid() instead,
      it can do pid_task(PIDTYPE_PID) itself and read ->group_leader only if
      necessary.
      
      The patch also extracts the "inode is dead" code from
      pid_delete_dentry(dentry) into the new trivial helper,
      proc_inode_is_dead(inode), proc_task_readdir() uses it to return -ENOENT
      if this dir was removed.
      
      This is a bit racy, but the race is very unlikely and the getdents() after
      opendir() can see the empty "." + ".." dir only once.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Sameer Nanda <snanda@chromium.org>
      Cc: Sergey Dyasly <dserrg@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • proc: change first_tid() to use while_each_thread() rather than next_thread() · c986c14a
      Oleg Nesterov committed
      Rewrite the main loop to use while_each_thread() instead of
      next_thread().  We are going to fix or replace while_each_thread(),
      next_thread() should be avoided whenever possible.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Sameer Nanda <snanda@chromium.org>
      Cc: Sergey Dyasly <dserrg@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • proc: fix the potential use-after-free in first_tid() · 940fe479
      Oleg Nesterov committed
      proc_task_readdir() verifies that the result of get_proc_task() is
      pid_alive() and thus its ->group_leader is fine too.  However this is not
      necessarily true after rcu_read_unlock(), we need to recheck this again
      after first_tid() does rcu_read_lock().  Otherwise
      leader->thread_group.next (used by next_thread()) can be invalid if the
      rcu grace period expires in between.
      
      The race is subtle and unlikely, but it is still possible afaics.  To
      simplify, let's ignore the "likely" case when tid != 0; f_version can be
      cleared by proc_task_operations->llseek().
      
      Suppose we have a main thread M and its subthread T.  Suppose that f_pos
      == 3, iow first_tid() should return T.  Now suppose that the following
      happens between rcu_read_unlock() and rcu_read_lock():
      
      	1. T execs and becomes the new leader. This removes M from
      	    ->thread_group but next_thread(M) is still T.
      
      	2. T creates another thread X which does exec as well, T
      	   goes away.
      
      	3. X creates another subthread, this increments nr_threads.
      
      	4. first_tid() does next_thread(M) and returns the already
      	   dead T.
      
      Note also that we need 2. and 3. only because of the get_nr_threads()
      check, and this check was supposed to be an optimization only.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Sameer Nanda <snanda@chromium.org>
      Cc: Sergey Dyasly <dserrg@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 06 Nov, 2013 1 commit
  6. 29 Jun, 2013 5 commits
  7. 28 May, 2013 1 commit
  8. 02 May, 2013 2 commits
  9. 01 May, 2013 1 commit
  10. 18 Apr, 2013 2 commits
    • posix-timers: Show sigevent info in proc file · 57b8015e
      Pavel Emelyanov committed
      Previous patch added proc file to list posix timers created by task.
      Expand the information provided in this file by adding info about
      notification method, with which timers were created. I.e. after
      the "ID:" line there go
      
      1. "signal:" line, that shows signal number and sigval bits;
      2. "notify:" line, that shows the timer notification method.
      
      Thus the timer entry would look like this:
      
      ID: 123
      signal: 14/0000000000b005d0
      notify: signal/pid.732
      
      This information is enough to understand how timer_create() was called
      for each particular timer.
      Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Matthew Helsley <matt.helsley@gmail.com>
      Link: http://lkml.kernel.org/r/513DA024.80404@parallels.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
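An external tool could read one such entry roughly as follows. This is a sketch that assumes exactly the three-line layout shown in the example entry above ("ID:", "signal:", "notify:" with a "pid." target); `struct timer_entry_sketch` and `parse_timer_entry()` are hypothetical names, and other notification forms are not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one timer entry; not a kernel structure. */
struct timer_entry_sketch {
	int id;				/* value of the "ID:" line */
	int signo;			/* signal number before the '/' */
	unsigned long long sigval;	/* sigval bits, printed in hex */
	char notify[16];		/* notification method, e.g. "signal" */
	int pid;			/* target after "pid." */
};

/*
 * Parse one entry in the layout shown in the commit message.
 * Whitespace in the format string also matches the newlines between
 * the lines of the entry. Returns 0 on success, -1 on any mismatch.
 */
int parse_timer_entry(const char *text, struct timer_entry_sketch *out)
{
	int n = sscanf(text,
		       " ID: %d signal: %d/%llx notify: %15[^/]/pid.%d",
		       &out->id, &out->signo, &out->sigval,
		       out->notify, &out->pid);

	return n == 5 ? 0 : -1;
}
```

Because the file is smaps-like and meant to grow, a tool meant for real use would more likely read it line by line and skip keys it does not recognise, rather than rely on one fixed pattern as this sketch does.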
    • posix-timers: Introduce /proc/PID/timers file · 48f6a7a5
      Pavel Emelyanov committed
      Currently kernel doesn't provide any API for getting info about what
      posix timers are configured by processes. It's implied, that a process
      which configured some timers, knows what it did. However, for external
      tools it's impossible to get this information. In particular, this is
      critical for checkpoint-restore project to have this info.
      
      Introduce a per-pid proc file with information about posix
      timers. Since these timers are shared between threads, this file is
      present on tgid level only, no such thing in tid subdirs.
      
      The file format is expected to be "/proc/<pid>/smaps"-like,
      i.e. each timer will occupy several lines to allow for future
      extension.
      
      Each new timer entry starts with the
      
      ID: <number>
      
      line which is added by this patch.
      Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Matthew Helsley <matt.helsley@gmail.com>
      Link: http://lkml.kernel.org/r/513DA00D.6070009@parallels.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  11. 10 Apr, 2013 1 commit
  12. 28 Feb, 2013 1 commit
  13. 26 Feb, 2013 2 commits
  14. 23 Feb, 2013 1 commit
  15. 21 Dec, 2012 1 commit
  16. 12 Dec, 2012 1 commit
  17. 11 Dec, 2012 1 commit
  18. 27 Nov, 2012 1 commit
  19. 19 Nov, 2012 3 commits
    • pidns: Make the pidns proc mount/umount logic obvious. · 0a01f2cc
      Eric W. Biederman committed
      Track the number of pids in the proc hash table.  When the number of
      pids goes to 0 schedule work to unmount the kernel mount of proc.
      
      Move the mount of proc into alloc_pid when we allocate the pid for
      init.
      
      Remove the surprising calls of pid_ns_release_proc in fork and
      proc_flush_task.  Those code paths really shouldn't know about proc
      namespace implementation details and people have demonstrated several
      times that finding and understanding those code paths is difficult and
      non-obvious.
      
      Because of the call path, detach_pid is always called with the
      rtnl_lock held, so free_pid is not allowed to sleep and the work of
      unmounting proc is moved to a work queue.  This has the side benefit
      of not blocking the entire world waiting for the unnecessary
      rcu_barrier in deactivate_locked_super.
      
      In the process of making the code clear and obvious this fixes a bug
      reported by Gao feng <gaofeng@cn.fujitsu.com> where we would leak a
      mount of proc during clone(CLONE_NEWPID|CLONE_NEWNET) if copy_pid_ns
      succeeded and copy_net_ns failed.
      Acked-by: "Serge E. Hallyn" <serge@hallyn.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
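The counting scheme in this commit message can be sketched in user space. Everything here is an illustrative assumption: `struct pidns_sketch` and its fields are hypothetical, and a boolean flag stands in for queueing the unmount on a work queue. The idea shown is that proc is mounted when the pid for init is allocated, each allocation bumps a counter, and the free path only flags deferred work when the counter reaches zero, so it never sleeps.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of a pid namespace; not the kernel struct. */
struct pidns_sketch {
	unsigned int nr_hashed;	/* pids currently in the hash table */
	bool proc_mounted;	/* set when the pid for init is allocated */
	bool unmount_queued;	/* stands in for scheduling deferred work */
};

/* Allocation path: mount proc for init, then count the new pid. */
void sketch_alloc_pid(struct pidns_sketch *ns, bool is_init)
{
	if (is_init)
		ns->proc_mounted = true;
	ns->nr_hashed++;
}

/*
 * Free path: must not sleep, so when the last pid goes away it only
 * records that the unmount should run later from a work queue.
 */
void sketch_free_pid(struct pidns_sketch *ns)
{
	if (--ns->nr_hashed == 0)
		ns->unmount_queued = true;
}
```

Keeping the mount and unmount decisions next to the counter is what lets fork and proc_flush_task stay unaware of how proc is managed, which is the cleanup the commit message describes.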
    • procfs: Don't cache a pid in the root inode. · ae06c7c8
      Eric W. Biederman committed
      Now that we have s_fs_info pointing to our pid namespace
      the original reason for the proc root inode having a struct
      pid is gone.
      
      Caching a pid in the root inode has led to some complicated
      code.  Now that we don't need the struct pid, just remove it.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • procfs: Use the proc generic infrastructure for proc/self. · e656d8a6
      Eric W. Biederman committed
      I had visions at one point of splitting proc into two filesystems.  If
      that had happened proc/self being the part of proc that actually deals
      with pids would have been a nice cleanup.  As it is proc/self requires
      a lot of unnecessary infrastructure for a single file.
      
      The only user visible change is that a mounted /proc for a pid namespace
      that is dead now shows a broken proc symlink, instead of being completely
      invisible.  I don't think anyone will notice or care.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
  20. 17 Nov, 2012 1 commit
  21. 30 Oct, 2012 1 commit
    • sched/autogroup: Fix crash on reboot when autogroup is disabled · 5258f386
      Mike Galbraith committed
      Due to these two commits:
      
        8323f26c sched: Fix race in task_group()
        800d4d30 sched, autogroup: Stop going ahead if autogroup is disabled
      
      ... autogroup scheduling's dynamic knobs are wrecked.
      
      With both patches applied, all you have to do to crash a box is
      disable autogroup during boot up, then reboot.. boom, NULL pointer
      dereference due to 800d4d30 not allowing autogroup to move things,
      and 8323f26c making that the only way to switch runqueues.
      
      Remove most of the (dysfunctional) knobs and turn the remaining
      sched_autogroup_enabled knob readonly.
      
      If the user fiddles with cgroups hereafter, once tasks
      are moved, autogroup won't mess with them again unless
      they call setsid().
      
      No knobs, no glitz, nada, just a cute little thing folks can
      turn on if they don't want to muck about with cgroups and/or
      systemd.
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Cc: Xiaotian Feng <xtfeng@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Xiaotian Feng <dannyfeng@tencent.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: <stable@vger.kernel.org> # v3.6
      Link: http://lkml.kernel.org/r/1351451963.4999.8.camel@maggy.simpson.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  22. 13 Oct, 2012 1 commit
  23. 09 Oct, 2012 1 commit
  24. 27 Sep, 2012 3 commits