1. 10 8月, 2009 1 次提交
    • O
      mm_for_maps: simplify, use ptrace_may_access() · 13f0feaf
      Oleg Nesterov 提交于
      It would be nice to kill __ptrace_may_access(). It requires task_lock(),
      but this lock is only needed to read mm->flags in the middle.
      
      Convert mm_for_maps() to use ptrace_may_access(), this also simplifies
      the code a little bit.
      
      Also, we do not need to take ->mmap_sem in advance. In fact I think
      mm_for_maps() should not play with ->mmap_sem at all, the caller should
      take this lock.
      
      With or without this patch, without ->cred_guard_mutex held we can race
      with exec() and get the new ->mm but check old creds.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Reviewed-by: NSerge Hallyn <serue@us.ibm.com>
      Signed-off-by: NJames Morris <jmorris@namei.org>
      13f0feaf
  2. 17 6月, 2009 1 次提交
    • D
      oom: move oom_adj value from task_struct to mm_struct · 2ff05b2b
      David Rientjes 提交于
      The per-task oom_adj value is a characteristic of its mm more than the
      task itself since it's not possible to oom kill any thread that shares the
      mm.  If a task were to be killed while attached to an mm that could not be
      freed because another thread were set to OOM_DISABLE, it would have
      needlessly been terminated since there is no potential for future memory
      freeing.
      
      This patch moves oomkilladj (now more appropriately named oom_adj) from
      struct task_struct to struct mm_struct.  This requires task_lock() on a
      task to check its oom_adj value to protect against exec, but it's already
      necessary to take the lock when dereferencing the mm to find the total VM
      size for the badness heuristic.
      
      This fixes a livelock if the oom killer chooses a task and another thread
      sharing the same memory has an oom_adj value of OOM_DISABLE.  This occurs
      because oom_kill_task() repeatedly returns 1 and refuses to kill the
      chosen task while select_bad_process() will repeatedly choose the same
      task during the next retry.
      
      Taking task_lock() in select_bad_process() to check for OOM_DISABLE and in
      oom_kill_task() to check for threads sharing the same memory will be
      removed in the next patch in this series where it will no longer be
      necessary.
      
      Writing to /proc/pid/oom_adj for a kthread will now return -EINVAL since
      these threads are immune from oom killing already.  They simply report an
      oom_adj value of OOM_DISABLE.
      
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2ff05b2b
  3. 29 5月, 2009 1 次提交
  4. 11 5月, 2009 1 次提交
  5. 05 5月, 2009 1 次提交
  6. 17 4月, 2009 1 次提交
  7. 01 4月, 2009 1 次提交
  8. 29 3月, 2009 1 次提交
    • H
      fix setuid sometimes wouldn't · 7c2c7d99
      Hugh Dickins 提交于
      check_unsafe_exec() also notes whether the fs_struct is being
      shared by more threads than will get killed by the exec, and if so
      sets LSM_UNSAFE_SHARE to make bprm_set_creds() careful about euid.
      But /proc/<pid>/cwd and /proc/<pid>/root lookups make transient
      use of get_fs_struct(), which also raises that sharing count.
      
      This might occasionally cause a setuid program not to change euid,
      in the same way as happened with files->count (check_unsafe_exec
      also looks at sighand->count, but /proc doesn't raise that one).
      
      We'd prefer exec not to unshare fs_struct: so fix this in procfs,
      replacing get_fs_struct() by get_fs_path(), which does path_get
      while still holding task_lock, instead of raising fs->count.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Cc: stable@kernel.org
      ___
      
       fs/proc/base.c |   50 +++++++++++++++--------------------------------
       1 file changed, 16 insertions(+), 34 deletions(-)
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7c2c7d99
  9. 28 3月, 2009 1 次提交
  10. 18 3月, 2009 1 次提交
    • L
      Avoid 64-bit "switch()" statements on 32-bit architectures · ee568b25
      Linus Torvalds 提交于
      Commit ee6f779b ("filp->f_pos not
      correctly updated in proc_task_readdir") changed the proc code to use
      filp->f_pos directly, rather than through a temporary variable.  In the
      process, that caused the operations to be done on the full 64 bits, even
      though the offset is never that big.
      
      That's all fine and dandy per se, but for some unfathomable reason gcc
      generates absolutely horrid code when using 64-bit values in switch()
      statements.  To the point of actually calling out to gcc helper
      functions like __cmpdi2 rather than just doing the trivial comparisons
      directly the way gcc does for normal compares.  At which point we get
      link failures, because we really don't want to support that kind of
      crazy code.
      
      Fix this by just casting the f_pos value to "unsigned long", which
      is plenty big enough for /proc, and avoids the gcc code generation issue.
      Reported-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Cc: Zhang Le <r0bertz@gentoo.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ee568b25
  11. 16 3月, 2009 1 次提交
    • Z
      filp->f_pos not correctly updated in proc_task_readdir · ee6f779b
      Zhang Le 提交于
      filp->f_pos only get updated at the end of the function. Thus d_off of those
      dirents who are in the middle will be 0, and this will cause a problem in
      glibc's readdir implementation, specifically endless loop. Because when overflow
      occurs, f_pos will be set to next dirent to read, however it will be 0, unless
      the next one is the last one. So it will start over again and again.
      
      There is a sample program in man 2 gendents. This is the output of the program
      running on a multithread program's task dir before this patch is applied:
      
        $ ./a.out /proc/3807/task
        --------------- nread=128 ---------------
        i-node#  file type  d_reclen  d_off   d_name
          506442  directory    16          1  .
          506441  directory    16          0  ..
          506443  directory    16          0  3807
          506444  directory    16          0  3809
          506445  directory    16          0  3812
          506446  directory    16          0  3861
          506447  directory    16          0  3862
          506448  directory    16          8  3863
      
      This is the output after this patch is applied
      
        $ ./a.out /proc/3807/task
        --------------- nread=128 ---------------
        i-node#  file type  d_reclen  d_off   d_name
          506442  directory    16          1  .
          506441  directory    16          2  ..
          506443  directory    16          3  3807
          506444  directory    16          4  3809
          506445  directory    16          5  3812
          506446  directory    16          6  3861
          506447  directory    16          7  3862
          506448  directory    16          8  3863
      Signed-off-by: NZhang Le <r0bertz@gentoo.org>
      Acked-by: NAl Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ee6f779b
  12. 06 1月, 2009 1 次提交
  13. 05 1月, 2009 5 次提交
  14. 22 12月, 2008 1 次提交
    • I
      sched: fix warning in fs/proc/base.c · 826e08b0
      Ingo Molnar 提交于
      Stephen Rothwell reported this new (harmless) build warning on platforms that
      define u64 to long:
      
       fs/proc/base.c: In function 'proc_pid_schedstat':
       fs/proc/base.c:352: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'u64'
      
      asm-generic/int-l64.h platforms strike again: that file should be eliminated.
      
      Fix it by casting the parameters to long long.
      Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      826e08b0
  15. 18 12月, 2008 1 次提交
  16. 11 12月, 2008 1 次提交
  17. 14 11月, 2008 2 次提交
  18. 21 10月, 2008 1 次提交
  19. 10 10月, 2008 3 次提交
  20. 06 8月, 2008 1 次提交
    • A
      proc: fix warnings · 7c44319d
      Alexander Beregalov 提交于
      proc: fix warnings
      
       fs/proc/base.c:2429: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'u64'
       fs/proc/base.c:2429: warning: format '%llu' expects type 'long long unsigned int', but argument 4 has type 'u64'
       fs/proc/base.c:2429: warning: format '%llu' expects type 'long long unsigned int', but argument 5 has type 'u64'
       fs/proc/base.c:2429: warning: format '%llu' expects type 'long long unsigned int', but argument 6 has type 'u64'
       fs/proc/base.c:2429: warning: format '%llu' expects type 'long long unsigned int', but argument 7 has type 'u64'
       fs/proc/base.c:2429: warning: format '%llu' expects type 'long long unsigned int', but argument 8 has type 'u64'
       fs/proc/base.c:2429: warning: format '%llu' expects type 'long long unsigned int', but argument 9 has type 'u64'
      Signed-off-by: NAlexander Beregalov <a.beregalov@gmail.com>
      Acked-by: NAndrea Righi <righi.andrea@gmail.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7c44319d
  21. 28 7月, 2008 2 次提交
  22. 27 7月, 2008 4 次提交
  23. 26 7月, 2008 1 次提交
  24. 14 7月, 2008 1 次提交
    • S
      Security: split proc ptrace checking into read vs. attach · 006ebb40
      Stephen Smalley 提交于
      Enable security modules to distinguish reading of process state via
      proc from full ptrace access by renaming ptrace_may_attach to
      ptrace_may_access and adding a mode argument indicating whether only
      read access or full attach access is requested.  This allows security
      modules to permit access to reading process state without granting
      full ptrace access.  The base DAC/capability checking remains unchanged.
      
      Read access to /proc/pid/mem continues to apply a full ptrace attach
      check since check_mem_permission() already requires the current task
      to already be ptracing the target.  The other ptrace checks within
      proc for elements like environ, maps, and fds are changed to pass the
      read mode instead of attach.
      
      In the SELinux case, we model such reading of process state as a
      reading of a proc file labeled with the target process' label.  This
      enables SELinux policy to permit such reading of process state without
      permitting control or manipulation of the target process, as there are
      a number of cases where programs probe for such information via proc
      but do not need to be able to control the target (e.g. procps,
      lsof, PolicyKit, ConsoleKit).  At present we have to choose between
      allowing full ptrace in policy (more permissive than required/desired)
      or breaking functionality (or in some cases just silencing the denials
      via dontaudit rules but this can hide genuine attacks).
      
      This version of the patch incorporates comments from Casey Schaufler
      (change/replace existing ptrace_may_attach interface, pass access
      mode), and Chris Wright (provide greater consistency in the checking).
      
      Note that like their predecessors __ptrace_may_attach and
      ptrace_may_attach, the __ptrace_may_access and ptrace_may_access
      interfaces use different return value conventions from each other (0
      or -errno vs. 1 or 0).  I retained this difference to avoid any
      changes to the caller logic but made the difference clearer by
      changing the latter interface to return a bool rather than an int and
      by adding a comment about it to ptrace.h for any future callers.
      Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
      Acked-by: NChris Wright <chrisw@sous-sol.org>
      Signed-off-by: NJames Morris <jmorris@namei.org>
      006ebb40
  25. 07 6月, 2008 1 次提交
  26. 17 5月, 2008 1 次提交
  27. 02 5月, 2008 1 次提交
  28. 29 4月, 2008 2 次提交
    • R
      procfs: mem permission cleanup · 638fa202
      Roland McGrath 提交于
      This cleans up the permission checks done for /proc/PID/mem i/o calls.  It
      puts all the logic in a new function, check_mem_permission().
      
      The old code repeated the (!MAY_PTRACE(task) || !ptrace_may_attach(task))
      magical expression multiple times.  The new function does all that work in one
      place, with clear comments.
      
      The old code called security_ptrace() twice on successful checks, once in
      MAY_PTRACE() and once in __ptrace_may_attach().  Now it's only called once,
      and only if all other checks have succeeded.
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      638fa202
    • M
      procfs task exe symlink · 925d1c40
      Matt Helsley 提交于
      The kernel implements readlink of /proc/pid/exe by getting the file from
      the first executable VMA.  Then the path to the file is reconstructed and
      reported as the result.
      
      Because of the VMA walk the code is slightly different on nommu systems.
      This patch avoids separate /proc/pid/exe code on nommu systems.  Instead of
      walking the VMAs to find the first executable file-backed VMA we store a
      reference to the exec'd file in the mm_struct.
      
      That reference would prevent the filesystem holding the executable file
      from being unmounted even after unmapping the VMAs.  So we track the number
      of VM_EXECUTABLE VMAs and drop the new reference when the last one is
      unmapped.  This avoids pinning the mounted filesystem.
      
      [akpm@linux-foundation.org: improve comments]
      [yamamoto@valinux.co.jp: fix dup_mmap]
      Signed-off-by: NMatt Helsley <matthltc@us.ibm.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: David Howells <dhowells@redhat.com>
      Cc:"Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: NYAMAMOTO Takashi <yamamoto@valinux.co.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      925d1c40