1. 11 2月, 2011 1 次提交
  2. 03 4月, 2010 1 次提交
  3. 10 12月, 2009 1 次提交
  4. 24 11月, 2009 2 次提交
    • S
      remove CONFIG_SECURITY_FILE_CAPABILITIES compile option · b3a222e5
      Serge E. Hallyn 提交于
      As far as I know, all distros currently ship kernels with default
      CONFIG_SECURITY_FILE_CAPABILITIES=y.  Since having the option on
      leaves a 'no_file_caps' option to boot without file capabilities,
      the main reason to keep the option is that turning it off saves
      you (on my s390x partition) 5k.  In particular, vmlinux sizes
      came to:
      
      without patch fscaps=n:		 	53598392
      without patch fscaps=y:		 	53603406
      with this patch applied:		53603342
      
      with the security-next tree.
      
      Against this we must weigh the fact that there is no simple way for
      userspace to figure out whether file capabilities are supported,
      while things like per-process securebits, capability bounding
      sets, and adding bits to pI if CAP_SETPCAP is in pE are not supported
      with SECURITY_FILE_CAPABILITIES=n, leaving a bit of a problem for
      applications wanting to know whether they can use them and/or why
      something failed.
      
      It also adds another subtly different set of semantics which we must
      maintain at the risk of severe security regressions.
      
      So this patch removes the SECURITY_FILE_CAPABILITIES compile
      option.  It drops the kernel size by about 50k over the stock
      SECURITY_FILE_CAPABILITIES=y kernel, by removing the
      cap_limit_ptraced_target() function.
      
      Changelog:
      	Nov 20: remove cap_limit_ptraced_target() as it's logic
      		was ifndef'ed.
      Signed-off-by: NSerge E. Hallyn <serue@us.ibm.com>
      Acked-by: NAndrew G. Morgan" <morgan@kernel.org>
      Signed-off-by: NJames Morris <jmorris@namei.org>
      b3a222e5
    • A
      Silence the existing API for capability version compatibility check. · c4a5af54
      Andrew G. Morgan 提交于
      When libcap, or other libraries attempt to confirm/determine the supported
      capability version magic, they generally supply a NULL dataptr to capget().
      
      In this case, while returning the supported/preferred magic (via a
      modified header content), the return code of this system call may be 0,
      -EINVAL, or -EFAULT.
      
      No libcap code depends on the previous -EINVAL etc. return code, and
      all of the above three return codes can accompany a valid (successful)
      attempt to determine the requested magic value.
      
      This patch cleans up the system call to return 0, if the call is
      successfully being used to determine the supported/preferred capability
      magic value.
      Signed-off-by: NAndrew G. Morgan <morgan@kernel.org>
      Acked-by: NSteve Grubb <sgrubb@redhat.com>
      Acked-by: NSerge Hallyn <serue@us.ibm.com>
      Signed-off-by: NJames Morris <jmorris@namei.org>
      c4a5af54
  5. 14 10月, 2009 1 次提交
  6. 14 1月, 2009 1 次提交
  7. 07 1月, 2009 2 次提交
    • D
      CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #3] · 3699c53c
      David Howells 提交于
      Fix a regression in cap_capable() due to:
      
      	commit 3b11a1de
      	Author: David Howells <dhowells@redhat.com>
      	Date:   Fri Nov 14 10:39:26 2008 +1100
      
      	    CRED: Differentiate objective and effective subjective credentials on a task
      
      The problem is that the above patch allows a process to have two sets of
      credentials, and for the most part uses the subjective credentials when
      accessing current's creds.
      
      There is, however, one exception: cap_capable(), and thus capable(), uses the
      real/objective credentials of the target task, whether or not it is the current
      task.
      
      Ordinarily this doesn't matter, since usually the two cred pointers in current
      point to the same set of creds.  However, sys_faccessat() makes use of this
      facility to override the credentials of the calling process to make its test,
      without affecting the creds as seen from other processes.
      
      One of the things sys_faccessat() does is to make an adjustment to the
      effective capabilities mask, which cap_capable(), as it stands, then ignores.
      
      The affected capability check is in generic_permission():
      
      	if (!(mask & MAY_EXEC) || execute_ok(inode))
      		if (capable(CAP_DAC_OVERRIDE))
      			return 0;
      
      This change passes the set of credentials to be tested down into the commoncap
      and SELinux code.  The security functions called by capable() and
      has_capability() select the appropriate set of credentials from the process
      being checked.
      
      This can be tested by compiling the following program from the XFS testsuite:
      
      /*
       *  t_access_root.c - trivial test program to show permission bug.
       *
       *  Written by Michael Kerrisk - copyright ownership not pursued.
       *  Sourced from: http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-10/6030.html
       */
      #include <limits.h>
      #include <unistd.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <fcntl.h>
      #include <sys/stat.h>
      
      #define UID 500
      #define GID 100
      #define PERM 0
      #define TESTPATH "/tmp/t_access"
      
      static void
      errExit(char *msg)
      {
          perror(msg);
          exit(EXIT_FAILURE);
      } /* errExit */
      
      static void
      accessTest(char *file, int mask, char *mstr)
      {
          printf("access(%s, %s) returns %d\n", file, mstr, access(file, mask));
      } /* accessTest */
      
      int
      main(int argc, char *argv[])
      {
          int fd, perm, uid, gid;
          char *testpath;
          char cmd[PATH_MAX + 20];
      
          testpath = (argc > 1) ? argv[1] : TESTPATH;
          perm = (argc > 2) ? strtoul(argv[2], NULL, 8) : PERM;
          uid = (argc > 3) ? atoi(argv[3]) : UID;
          gid = (argc > 4) ? atoi(argv[4]) : GID;
      
          unlink(testpath);
      
          fd = open(testpath, O_RDWR | O_CREAT, 0);
          if (fd == -1) errExit("open");
      
          if (fchown(fd, uid, gid) == -1) errExit("fchown");
          if (fchmod(fd, perm) == -1) errExit("fchmod");
          close(fd);
      
          snprintf(cmd, sizeof(cmd), "ls -l %s", testpath);
          system(cmd);
      
          if (seteuid(uid) == -1) errExit("seteuid");
      
          accessTest(testpath, 0, "0");
          accessTest(testpath, R_OK, "R_OK");
          accessTest(testpath, W_OK, "W_OK");
          accessTest(testpath, X_OK, "X_OK");
          accessTest(testpath, R_OK | W_OK, "R_OK | W_OK");
          accessTest(testpath, R_OK | X_OK, "R_OK | X_OK");
          accessTest(testpath, W_OK | X_OK, "W_OK | X_OK");
          accessTest(testpath, R_OK | W_OK | X_OK, "R_OK | W_OK | X_OK");
      
          exit(EXIT_SUCCESS);
      } /* main */
      
      This can be run against an Ext3 filesystem as well as against an XFS
      filesystem.  If successful, it will show:
      
      	[root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043
      	---------- 1 dhowells dhowells 0 2008-12-31 03:00 /tmp/xxx
      	access(/tmp/xxx, 0) returns 0
      	access(/tmp/xxx, R_OK) returns 0
      	access(/tmp/xxx, W_OK) returns 0
      	access(/tmp/xxx, X_OK) returns -1
      	access(/tmp/xxx, R_OK | W_OK) returns 0
      	access(/tmp/xxx, R_OK | X_OK) returns -1
      	access(/tmp/xxx, W_OK | X_OK) returns -1
      	access(/tmp/xxx, R_OK | W_OK | X_OK) returns -1
      
      If unsuccessful, it will show:
      
      	[root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043
      	---------- 1 dhowells dhowells 0 2008-12-31 02:56 /tmp/xxx
      	access(/tmp/xxx, 0) returns 0
      	access(/tmp/xxx, R_OK) returns -1
      	access(/tmp/xxx, W_OK) returns -1
      	access(/tmp/xxx, X_OK) returns -1
      	access(/tmp/xxx, R_OK | W_OK) returns -1
      	access(/tmp/xxx, R_OK | X_OK) returns -1
      	access(/tmp/xxx, W_OK | X_OK) returns -1
      	access(/tmp/xxx, R_OK | W_OK | X_OK) returns -1
      
      I've also tested the fix with the SELinux and syscalls LTP testsuites.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Tested-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      Acked-by: NSerge Hallyn <serue@us.ibm.com>
      Signed-off-by: NJames Morris <jmorris@namei.org>
      3699c53c
    • J
      Revert "CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #2]" · 29881c45
      James Morris 提交于
      This reverts commit 14eaddc9.
      
      David has a better version to come.
      29881c45
  8. 05 1月, 2009 2 次提交
    • D
      CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #2] · 14eaddc9
      David Howells 提交于
      Fix a regression in cap_capable() due to:
      
      	commit 5ff7711e635b32f0a1e558227d030c7e45b4a465
      	Author: David Howells <dhowells@redhat.com>
      	Date:   Wed Dec 31 02:52:28 2008 +0000
      
      	    CRED: Differentiate objective and effective subjective credentials on a task
      
      The problem is that the above patch allows a process to have two sets of
      credentials, and for the most part uses the subjective credentials when
      accessing current's creds.
      
      There is, however, one exception: cap_capable(), and thus capable(), uses the
      real/objective credentials of the target task, whether or not it is the current
      task.
      
      Ordinarily this doesn't matter, since usually the two cred pointers in current
      point to the same set of creds.  However, sys_faccessat() makes use of this
      facility to override the credentials of the calling process to make its test,
      without affecting the creds as seen from other processes.
      
      One of the things sys_faccessat() does is to make an adjustment to the
      effective capabilities mask, which cap_capable(), as it stands, then ignores.
      
      The affected capability check is in generic_permission():
      
      	if (!(mask & MAY_EXEC) || execute_ok(inode))
      		if (capable(CAP_DAC_OVERRIDE))
      			return 0;
      
      This change splits capable() from has_capability() down into the commoncap and
      SELinux code.  The capable() security op now only deals with the current
      process, and uses the current process's subjective creds.  A new security op -
      task_capable() - is introduced that can check any task's objective creds.
      
      strictly the capable() security op is superfluous with the presence of the
      task_capable() op, however it should be faster to call the capable() op since
      two fewer arguments need be passed down through the various layers.
      
      This can be tested by compiling the following program from the XFS testsuite:
      
      /*
       *  t_access_root.c - trivial test program to show permission bug.
       *
       *  Written by Michael Kerrisk - copyright ownership not pursued.
       *  Sourced from: http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-10/6030.html
       */
      #include <limits.h>
      #include <unistd.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <fcntl.h>
      #include <sys/stat.h>
      
      #define UID 500
      #define GID 100
      #define PERM 0
      #define TESTPATH "/tmp/t_access"
      
      static void
      errExit(char *msg)
      {
          perror(msg);
          exit(EXIT_FAILURE);
      } /* errExit */
      
      static void
      accessTest(char *file, int mask, char *mstr)
      {
          printf("access(%s, %s) returns %d\n", file, mstr, access(file, mask));
      } /* accessTest */
      
      int
      main(int argc, char *argv[])
      {
          int fd, perm, uid, gid;
          char *testpath;
          char cmd[PATH_MAX + 20];
      
          testpath = (argc > 1) ? argv[1] : TESTPATH;
          perm = (argc > 2) ? strtoul(argv[2], NULL, 8) : PERM;
          uid = (argc > 3) ? atoi(argv[3]) : UID;
          gid = (argc > 4) ? atoi(argv[4]) : GID;
      
          unlink(testpath);
      
          fd = open(testpath, O_RDWR | O_CREAT, 0);
          if (fd == -1) errExit("open");
      
          if (fchown(fd, uid, gid) == -1) errExit("fchown");
          if (fchmod(fd, perm) == -1) errExit("fchmod");
          close(fd);
      
          snprintf(cmd, sizeof(cmd), "ls -l %s", testpath);
          system(cmd);
      
          if (seteuid(uid) == -1) errExit("seteuid");
      
          accessTest(testpath, 0, "0");
          accessTest(testpath, R_OK, "R_OK");
          accessTest(testpath, W_OK, "W_OK");
          accessTest(testpath, X_OK, "X_OK");
          accessTest(testpath, R_OK | W_OK, "R_OK | W_OK");
          accessTest(testpath, R_OK | X_OK, "R_OK | X_OK");
          accessTest(testpath, W_OK | X_OK, "W_OK | X_OK");
          accessTest(testpath, R_OK | W_OK | X_OK, "R_OK | W_OK | X_OK");
      
          exit(EXIT_SUCCESS);
      } /* main */
      
      This can be run against an Ext3 filesystem as well as against an XFS
      filesystem.  If successful, it will show:
      
      	[root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043
      	---------- 1 dhowells dhowells 0 2008-12-31 03:00 /tmp/xxx
      	access(/tmp/xxx, 0) returns 0
      	access(/tmp/xxx, R_OK) returns 0
      	access(/tmp/xxx, W_OK) returns 0
      	access(/tmp/xxx, X_OK) returns -1
      	access(/tmp/xxx, R_OK | W_OK) returns 0
      	access(/tmp/xxx, R_OK | X_OK) returns -1
      	access(/tmp/xxx, W_OK | X_OK) returns -1
      	access(/tmp/xxx, R_OK | W_OK | X_OK) returns -1
      
      If unsuccessful, it will show:
      
      	[root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043
      	---------- 1 dhowells dhowells 0 2008-12-31 02:56 /tmp/xxx
      	access(/tmp/xxx, 0) returns 0
      	access(/tmp/xxx, R_OK) returns -1
      	access(/tmp/xxx, W_OK) returns -1
      	access(/tmp/xxx, X_OK) returns -1
      	access(/tmp/xxx, R_OK | W_OK) returns -1
      	access(/tmp/xxx, R_OK | X_OK) returns -1
      	access(/tmp/xxx, W_OK | X_OK) returns -1
      	access(/tmp/xxx, R_OK | W_OK | X_OK) returns -1
      
      I've also tested the fix with the SELinux and syscalls LTP testsuites.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NJames Morris <jmorris@namei.org>
      14eaddc9
    • A
      sanitize audit_log_capset() · 57f71a0a
      Al Viro 提交于
      * no allocations
      * return void
      * don't duplicate checked for dummy context
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      57f71a0a
  9. 14 11月, 2008 3 次提交
    • D
      CRED: Inaugurate COW credentials · d84f4f99
      David Howells 提交于
      Inaugurate copy-on-write credentials management.  This uses RCU to manage the
      credentials pointer in the task_struct with respect to accesses by other tasks.
      A process may only modify its own credentials, and so does not need locking to
      access or modify its own credentials.
      
      A mutex (cred_replace_mutex) is added to the task_struct to control the effect
      of PTRACE_ATTACHED on credential calculations, particularly with respect to
      execve().
      
      With this patch, the contents of an active credentials struct may not be
      changed directly; rather a new set of credentials must be prepared, modified
      and committed using something like the following sequence of events:
      
      	struct cred *new = prepare_creds();
      	int ret = blah(new);
      	if (ret < 0) {
      		abort_creds(new);
      		return ret;
      	}
      	return commit_creds(new);
      
      There are some exceptions to this rule: the keyrings pointed to by the active
      credentials may be instantiated - keyrings violate the COW rule as managing
      COW keyrings is tricky, given that it is possible for a task to directly alter
      the keys in a keyring in use by another task.
      
      To help enforce this, various pointers to sets of credentials, such as those in
      the task_struct, are declared const.  The purpose of this is compile-time
      discouragement of altering credentials through those pointers.  Once a set of
      credentials has been made public through one of these pointers, it may not be
      modified, except under special circumstances:
      
        (1) Its reference count may incremented and decremented.
      
        (2) The keyrings to which it points may be modified, but not replaced.
      
      The only safe way to modify anything else is to create a replacement and commit
      using the functions described in Documentation/credentials.txt (which will be
      added by a later patch).
      
      This patch and the preceding patches have been tested with the LTP SELinux
      testsuite.
      
      This patch makes several logical sets of alteration:
      
       (1) execve().
      
           This now prepares and commits credentials in various places in the
           security code rather than altering the current creds directly.
      
       (2) Temporary credential overrides.
      
           do_coredump() and sys_faccessat() now prepare their own credentials and
           temporarily override the ones currently on the acting thread, whilst
           preventing interference from other threads by holding cred_replace_mutex
           on the thread being dumped.
      
           This will be replaced in a future patch by something that hands down the
           credentials directly to the functions being called, rather than altering
           the task's objective credentials.
      
       (3) LSM interface.
      
           A number of functions have been changed, added or removed:
      
           (*) security_capset_check(), ->capset_check()
           (*) security_capset_set(), ->capset_set()
      
           	 Removed in favour of security_capset().
      
           (*) security_capset(), ->capset()
      
           	 New.  This is passed a pointer to the new creds, a pointer to the old
           	 creds and the proposed capability sets.  It should fill in the new
           	 creds or return an error.  All pointers, barring the pointer to the
           	 new creds, are now const.
      
           (*) security_bprm_apply_creds(), ->bprm_apply_creds()
      
           	 Changed; now returns a value, which will cause the process to be
           	 killed if it's an error.
      
           (*) security_task_alloc(), ->task_alloc_security()
      
           	 Removed in favour of security_prepare_creds().
      
           (*) security_cred_free(), ->cred_free()
      
           	 New.  Free security data attached to cred->security.
      
           (*) security_prepare_creds(), ->cred_prepare()
      
           	 New. Duplicate any security data attached to cred->security.
      
           (*) security_commit_creds(), ->cred_commit()
      
           	 New. Apply any security effects for the upcoming installation of new
           	 security by commit_creds().
      
           (*) security_task_post_setuid(), ->task_post_setuid()
      
           	 Removed in favour of security_task_fix_setuid().
      
           (*) security_task_fix_setuid(), ->task_fix_setuid()
      
           	 Fix up the proposed new credentials for setuid().  This is used by
           	 cap_set_fix_setuid() to implicitly adjust capabilities in line with
           	 setuid() changes.  Changes are made to the new credentials, rather
           	 than the task itself as in security_task_post_setuid().
      
           (*) security_task_reparent_to_init(), ->task_reparent_to_init()
      
           	 Removed.  Instead the task being reparented to init is referred
           	 directly to init's credentials.
      
      	 NOTE!  This results in the loss of some state: SELinux's osid no
      	 longer records the sid of the thread that forked it.
      
           (*) security_key_alloc(), ->key_alloc()
           (*) security_key_permission(), ->key_permission()
      
           	 Changed.  These now take cred pointers rather than task pointers to
           	 refer to the security context.
      
       (4) sys_capset().
      
           This has been simplified and uses less locking.  The LSM functions it
           calls have been merged.
      
       (5) reparent_to_kthreadd().
      
           This gives the current thread the same credentials as init by simply using
           commit_thread() to point that way.
      
       (6) __sigqueue_alloc() and switch_uid()
      
           __sigqueue_alloc() can't stop the target task from changing its creds
           beneath it, so this function gets a reference to the currently applicable
           user_struct which it then passes into the sigqueue struct it returns if
           successful.
      
           switch_uid() is now called from commit_creds(), and possibly should be
           folded into that.  commit_creds() should take care of protecting
           __sigqueue_alloc().
      
       (7) [sg]et[ug]id() and co and [sg]et_current_groups.
      
           The set functions now all use prepare_creds(), commit_creds() and
           abort_creds() to build and check a new set of credentials before applying
           it.
      
           security_task_set[ug]id() is called inside the prepared section.  This
           guarantees that nothing else will affect the creds until we've finished.
      
           The calling of set_dumpable() has been moved into commit_creds().
      
           Much of the functionality of set_user() has been moved into
           commit_creds().
      
           The get functions all simply access the data directly.
      
       (8) security_task_prctl() and cap_task_prctl().
      
           security_task_prctl() has been modified to return -ENOSYS if it doesn't
           want to handle a function, or otherwise return the return value directly
           rather than through an argument.
      
           Additionally, cap_task_prctl() now prepares a new set of credentials, even
           if it doesn't end up using it.
      
       (9) Keyrings.
      
           A number of changes have been made to the keyrings code:
      
           (a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have
           	 all been dropped and built in to the credentials functions directly.
           	 They may want separating out again later.
      
           (b) key_alloc() and search_process_keyrings() now take a cred pointer
           	 rather than a task pointer to specify the security context.
      
           (c) copy_creds() gives a new thread within the same thread group a new
           	 thread keyring if its parent had one, otherwise it discards the thread
           	 keyring.
      
           (d) The authorisation key now points directly to the credentials to extend
           	 the search into rather pointing to the task that carries them.
      
           (e) Installing thread, process or session keyrings causes a new set of
           	 credentials to be created, even though it's not strictly necessary for
           	 process or session keyrings (they're shared).
      
      (10) Usermode helper.
      
           The usermode helper code now carries a cred struct pointer in its
           subprocess_info struct instead of a new session keyring pointer.  This set
           of credentials is derived from init_cred and installed on the new process
           after it has been cloned.
      
           call_usermodehelper_setup() allocates the new credentials and
           call_usermodehelper_freeinfo() discards them if they haven't been used.  A
           special cred function (prepare_usermodeinfo_creds()) is provided
           specifically for call_usermodehelper_setup() to call.
      
           call_usermodehelper_setkeys() adjusts the credentials to sport the
           supplied keyring as the new session keyring.
      
      (11) SELinux.
      
           SELinux has a number of changes, in addition to those to support the LSM
           interface changes mentioned above:
      
           (a) selinux_setprocattr() no longer does its check for whether the
           	 current ptracer can access processes with the new SID inside the lock
           	 that covers getting the ptracer's SID.  Whilst this lock ensures that
           	 the check is done with the ptracer pinned, the result is only valid
           	 until the lock is released, so there's no point doing it inside the
           	 lock.
      
      (12) is_single_threaded().
      
           This function has been extracted from selinux_setprocattr() and put into
           a file of its own in the lib/ directory as join_session_keyring() now
           wants to use it too.
      
           The code in SELinux just checked to see whether a task shared mm_structs
           with other tasks (CLONE_VM), but that isn't good enough.  We really want
           to know if they're part of the same thread group (CLONE_THREAD).
      
      (13) nfsd.
      
           The NFS server daemon now has to use the COW credentials to set the
           credentials it is going to use.  It really needs to pass the credentials
           down to the functions it calls, but it can't do that until other patches
           in this series have been applied.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NJames Morris <jmorris@namei.org>
      Signed-off-by: NJames Morris <jmorris@namei.org>
      d84f4f99
    • D
      CRED: Separate task security context from task_struct · b6dff3ec
      David Howells 提交于
      Separate the task security context from task_struct.  At this point, the
      security data is temporarily embedded in the task_struct with two pointers
      pointing to it.
      
      Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in
      entry.S via asm-offsets.
      
      With comment fixes Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NJames Morris <jmorris@namei.org>
      Acked-by: NSerge Hallyn <serue@us.ibm.com>
      Signed-off-by: NJames Morris <jmorris@namei.org>
      b6dff3ec
    • D
      CRED: Neuter sys_capset() · 1cdcbec1
      David Howells 提交于
      Take away the ability for sys_capset() to affect processes other than current.
      
      This means that current will not need to lock its own credentials when reading
      them against interference by other processes.
      
      This has effectively been the case for a while anyway, since:
      
       (1) Without LSM enabled, sys_capset() is disallowed.
      
       (2) With file-based capabilities, sys_capset() is neutered.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NSerge Hallyn <serue@us.ibm.com>
      Acked-by: NAndrew G. Morgan <morgan@kernel.org>
      Acked-by: NJames Morris <jmorris@namei.org>
      Signed-off-by: NJames Morris <jmorris@namei.org>
      1cdcbec1
  10. 11 11月, 2008 2 次提交
    • E
      Capabilities: BUG when an invalid capability is requested · 637d32dc
      Eric Paris 提交于
      If an invalid (large) capability is requested the capabilities system
      may panic as it is dereferencing an array of fixed (short) length.  Its
      possible (and actually often happens) that the capability system
      accidentally stumbled into a valid memory region but it also regularly
      happens that it hits invalid memory and BUGs.  If such an operation does
      get past cap_capable then the selinux system is sure to have problems as
      it already does a (simple) validity check and BUG.  This is known to
      happen by the broken and buggy firegl driver.
      
      This patch cleanly checks all capable calls and BUG if a call is for an
      invalid capability.  This will likely break the firegl driver for some
      situations, but it is the right thing to do.  Garbage into a security
      system gets you killed/bugged
      Signed-off-by: NEric Paris <eparis@redhat.com>
      Acked-by: NArjan van de Ven <arjan@linux.intel.com>
      Acked-by: NSerge Hallyn <serue@us.ibm.com>
      Acked-by: NAndrew G. Morgan <morgan@kernel.org>
      Signed-off-by: NJames Morris <jmorris@namei.org>
      637d32dc
    • E
      When the capset syscall is used it is not possible for audit to record the · e68b75a0
      Eric Paris 提交于
      actual capbilities being added/removed.  This patch adds a new record type
      which emits the target pid and the eff, inh, and perm cap sets.
      
      example output if you audit capset syscalls would be:
      
      type=SYSCALL msg=audit(1225743140.465:76): arch=c000003e syscall=126 success=yes exit=0 a0=17f2014 a1=17f201c a2=80000000 a3=7fff2ab7f060 items=0 ppid=2160 pid=2223 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=1 comm="setcap" exe="/usr/sbin/setcap" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null)
      type=UNKNOWN[1322] msg=audit(1225743140.465:76): pid=0 cap_pi=ffffffffffffffff cap_pp=ffffffffffffffff cap_pe=ffffffffffffffff
      Signed-off-by: NEric Paris <eparis@redhat.com>
      Acked-by: NSerge Hallyn <serue@us.ibm.com>
      Signed-off-by: NJames Morris <jmorris@namei.org>
      e68b75a0
  11. 06 11月, 2008 1 次提交
    • S
      file capabilities: add no_file_caps switch (v4) · 1f29fae2
      Serge E. Hallyn 提交于
      Add a no_file_caps boot option when file capabilities are
      compiled into the kernel (CONFIG_SECURITY_FILE_CAPABILITIES=y).
      
      This allows distributions to ship a kernel with file capabilities
      compiled in, without forcing users to use (and understand and
      trust) them.
      
      When no_file_caps is specified at boot, then when a process executes
      a file, any file capabilities stored with that file will not be
      used in the calculation of the process' new capability sets.
      
      This means that booting with the no_file_caps boot option will
      not be the same as booting a kernel with file capabilities
      compiled out - in particular a task with  CAP_SETPCAP will not
      have any chance of passing capabilities to another task (which
      isn't "really" possible anyway, and which may soon by killed
      altogether by David Howells in any case), and it will instead
      be able to put new capabilities in its pI.  However since fI
      will always be empty and pI is masked with fI, it gains the
      task nothing.
      
      We also support the extra prctl options, setting securebits and
      dropping capabilities from the per-process bounding set.
      
      The other remaining difference is that killpriv, task_setscheduler,
      setioprio, and setnice will continue to be hooked.  That will
      be noticable in the case where a root task changed its uid
      while keeping some caps, and another task owned by the new uid
      tries to change settings for the more privileged task.
      
      Changelog:
      	Nov 05 2008: (v4) trivial port on top of always-start-\
      		with-clear-caps patch
      	Sep 23 2008: nixed file_caps_enabled when file caps are
      		not compiled in as it isn't used.
      		Document no_file_caps in kernel-parameters.txt.
      Signed-off-by: NSerge Hallyn <serue@us.ibm.com>
      Acked-by: NAndrew G. Morgan <morgan@kernel.org>
      Signed-off-by: NJames Morris <jmorris@namei.org>
      1f29fae2
  12. 14 8月, 2008 1 次提交
    • D
      security: Fix setting of PF_SUPERPRIV by __capable() · 5cd9c58f
      David Howells 提交于
      Fix the setting of PF_SUPERPRIV by __capable() as it could corrupt the flags
      the target process if that is not the current process and it is trying to
      change its own flags in a different way at the same time.
      
      __capable() is using neither atomic ops nor locking to protect t->flags.  This
      patch removes __capable() and introduces has_capability() that doesn't set
      PF_SUPERPRIV on the process being queried.
      
      This patch further splits security_ptrace() in two:
      
       (1) security_ptrace_may_access().  This passes judgement on whether one
           process may access another only (PTRACE_MODE_ATTACH for ptrace() and
           PTRACE_MODE_READ for /proc), and takes a pointer to the child process.
           current is the parent.
      
       (2) security_ptrace_traceme().  This passes judgement on PTRACE_TRACEME only,
           and takes only a pointer to the parent process.  current is the child.
      
           In Smack and commoncap, this uses has_capability() to determine whether
           the parent will be permitted to use PTRACE_ATTACH if normal checks fail.
           This does not set PF_SUPERPRIV.
      
      Two of the instances of __capable() actually only act on current, and so have
      been changed to calls to capable().
      
      Of the places that were using __capable():
      
       (1) The OOM killer calls __capable() thrice when weighing the killability of a
           process.  All of these now use has_capability().
      
       (2) cap_ptrace() and smack_ptrace() were using __capable() to check to see
           whether the parent was allowed to trace any process.  As mentioned above,
           these have been split.  For PTRACE_ATTACH and /proc, capable() is now
           used, and for PTRACE_TRACEME, has_capability() is used.
      
       (3) cap_safe_nice() only ever saw current, so now uses capable().
      
       (4) smack_setprocattr() rejected accesses to tasks other than current just
           after calling __capable(), so the order of these two tests have been
           switched and capable() is used instead.
      
       (5) In smack_file_send_sigiotask(), we need to allow privileged processes to
           receive SIGIO on files they're manipulating.
      
       (6) In smack_task_wait(), we let a process wait for a privileged process,
           whether or not the process doing the waiting is privileged.
      
      I've tested this with the LTP SELinux and syscalls testscripts.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NSerge Hallyn <serue@us.ibm.com>
      Acked-by: NCasey Schaufler <casey@schaufler-ca.com>
      Acked-by: NAndrew G. Morgan <morgan@kernel.org>
      Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NJames Morris <jmorris@namei.org>
      5cd9c58f
  13. 25 7月, 2008 1 次提交
  14. 05 7月, 2008 1 次提交
  15. 01 6月, 2008 1 次提交
    • A
      capabilities: remain source compatible with 32-bit raw legacy capability support. · ca05a99a
      Andrew G. Morgan 提交于
      Source code out there hard-codes a notion of what the
      _LINUX_CAPABILITY_VERSION #define means in terms of the semantics of the
      raw capability system calls capget() and capset().  Its unfortunate, but
      true.
      
      Since the confusing header file has been in a released kernel, there is
      software that is erroneously using 64-bit capabilities with the semantics
      of 32-bit compatibilities.  These recently compiled programs may suffer
      corruption of their memory when sys_getcap() overwrites more memory than
      they are coded to expect, and the raising of added capabilities when using
      sys_capset().
      
      As such, this patch does a number of things to clean up the situation
      for all. It
      
        1. forces the _LINUX_CAPABILITY_VERSION define to always retain its
           legacy value.
      
        2. adopts a new #define strategy for the kernel's internal
           implementation of the preferred magic.
      
        3. deprecates v2 capability magic in favor of a new (v3) magic
           number. The functionality of v3 is entirely equivalent to v2,
           the only difference being that the v2 magic causes the kernel
           to log a "deprecated" warning so the admin can find applications
           that may be using v2 inappropriately.
      
      [User space code continues to be encouraged to use the libcap API which
      protects the application from details like this.  libcap-2.10 is the first
      to support v3 capabilities.]
      
      Fixes issue reported in https://bugzilla.redhat.com/show_bug.cgi?id=447518.
      Thanks to Bojan Smojver for the report.
      
      [akpm@linux-foundation.org: s/depreciate/deprecate/g]
      [akpm@linux-foundation.org: be robust about put_user size]
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NAndrew G. Morgan <morgan@kernel.org>
      Cc: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Bojan Smojver <bojan@rexursive.com>
      Cc: stable@kernel.org
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NChris Wright <chrisw@sous-sol.org>
      ca05a99a
  16. 06 2月, 2008 1 次提交
  17. 20 10月, 2007 4 次提交
    • P
      Uninline find_pid etc set of functions · 8990571e
      Pavel Emelyanov 提交于
      The find_pid/_vpid/_pid_ns functions are used to find the struct pid by its
      id, depending on whic id - global or virtual - is used.
      
      The find_vpid() is a macro that pushes the current->nsproxy->pid_ns on the
      stack to call another function - find_pid_ns().  It turned out, that this
      dereference together with the push itself cause the kernel text size to
      grow too much.
      
      Move all these out-of-line.  Together with the previous patch this saves a
      bit less that 400 bytes from .text section.
      Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
      Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Paul Menage <menage@google.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8990571e
    • P
      Uninline find_task_by_xxx set of functions · 228ebcbe
      Pavel Emelyanov 提交于
      The find_task_by_something is a set of macros are used to find task by pid
      depending on what kind of pid is proposed - global or virtual one.  All of
      them are wrappers above the most generic one - find_task_by_pid_type_ns() -
      and just substitute some args for it.
      
      It turned out, that dereferencing the current->nsproxy->pid_ns construction
      and pushing one more argument on the stack inline cause kernel text size to
      grow.
      
      This patch moves all this stuff out-of-line into kernel/pid.c.  Together
      with the next patch it saves a bit less than 400 bytes from the .text
      section.
      Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
      Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Paul Menage <menage@google.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      228ebcbe
    • P
      pid namespaces: changes to show virtual ids to user · b488893a
      Pavel Emelyanov 提交于
      This is the largest patch in the set. Make all (I hope) the places where
      the pid is shown to or get from user operate on the virtual pids.
      
      The idea is:
       - all in-kernel data structures must store either struct pid itself
         or the pid's global nr, obtained with pid_nr() call;
       - when seeking the task from kernel code with the stored id one
         should use find_task_by_pid() call that works with global pids;
       - when showing pid's numerical value to the user the virtual one
         should be used, but however when one shows task's pid outside this
         task's namespace the global one is to be used;
       - when getting the pid from userspace one need to consider this as
         the virtual one and use appropriate task/pid-searching functions.
      
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: nuther build fix]
      [akpm@linux-foundation.org: yet nuther build fix]
      [akpm@linux-foundation.org: remove unneeded casts]
      Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
      Signed-off-by: NAlexey Dobriyan <adobriyan@openvz.org>
      Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Paul Menage <menage@google.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b488893a
    • S
      pid namespaces: define is_global_init() and is_container_init() · b460cbc5
      Serge E. Hallyn 提交于
      is_init() is an ambiguous name for the pid==1 check.  Split it into
      is_global_init() and is_container_init().
      
      A cgroup init has it's tsk->pid == 1.
      
      A global init also has it's tsk->pid == 1 and it's active pid namespace
      is the init_pid_ns.  But rather than check the active pid namespace,
      compare the task structure with 'init_pid_ns.child_reaper', which is
      initialized during boot to the /sbin/init process and never changes.
      
      Changelog:
      
      	2.6.22-rc4-mm2-pidns1:
      	- Use 'init_pid_ns.child_reaper' to determine if a given task is the
      	  global init (/sbin/init) process. This would improve performance
      	  and remove dependence on the task_pid().
      
      	2.6.21-mm2-pidns2:
      
      	- [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc,
      	  ppc,avr32}/traps.c for the _exception() call to is_global_init().
      	  This way, we kill only the cgroup if the cgroup's init has a
      	  bug rather than force a kernel panic.
      
      [akpm@linux-foundation.org: fix comment]
      [sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c]
      [bunk@stusta.de: kernel/pid.c: remove unused exports]
      [sukadev@us.ibm.com: Fix capability.c to work with threaded init]
      Signed-off-by: NSerge E. Hallyn <serue@us.ibm.com>
      Signed-off-by: NSukadev Bhattiprolu <sukadev@us.ibm.com>
      Acked-by: NPavel Emelianov <xemul@openvz.org>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: Herbert Poetzel <herbert@13thfloor.at>
      Cc: Kirill Korotaev <dev@sw.ru>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b460cbc5
  18. 19 10月, 2007 2 次提交
    • D
      whitespace fixes: capability syscalls · 314f70fd
      Daniel Walker 提交于
      Large chunks of 5 spaces instead of tabs.
      Signed-off-by: NDaniel Walker <dwalker@mvista.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      314f70fd
    • A
      V3 file capabilities: alter behavior of cap_setpcap · 72c2d582
      Andrew Morgan 提交于
      The non-filesystem capability meaning of CAP_SETPCAP is that a process, p1,
      can change the capabilities of another process, p2.  This is not the
      meaning that was intended for this capability at all, and this
      implementation came about purely because, without filesystem capabilities,
      there was no way to use capabilities without one process bestowing them on
      another.
      
      Since we now have a filesystem support for capabilities we can fix the
      implementation of CAP_SETPCAP.
      
      The most significant thing about this change is that, with it in effect, no
      process can set the capabilities of another process.
      
      The capabilities of a program are set via the capability convolution
      rules:
      
         pI(post-exec) = pI(pre-exec)
         pP(post-exec) = (X(aka cap_bset) & fP) | (pI(post-exec) & fI)
         pE(post-exec) = fE ? pP(post-exec) : 0
      
      at exec() time.  As such, the only influence the pre-exec() program can
      have on the post-exec() program's capabilities are through the pI
      capability set.
      
      The correct implementation for CAP_SETPCAP (and that enabled by this patch)
      is that it can be used to add extra pI capabilities to the current process
      - to be picked up by subsequent exec()s when the above convolution rules
      are applied.
      
      Here is how it works:
      
      Let's say we have a process, p. It has capability sets, pE, pP and pI.
      Generally, p, can change the value of its own pI to pI' where
      
         (pI' & ~pI) & ~pP = 0.
      
      That is, the only new things in pI' that were not present in pI need to
      be present in pP.
      
      The role of CAP_SETPCAP is basically to permit changes to pI beyond
      the above:
      
         if (pE & CAP_SETPCAP) {
            pI' = anything; /* ie., even (pI' & ~pI) & ~pP != 0  */
         }
      
      This capability is useful for things like login, which (say, via
      pam_cap) might want to raise certain inheritable capabilities for use
      by the children of the logged-in user's shell, but those capabilities
      are not useful to or needed by the login program itself.
      
      One such use might be to limit who can run ping. You set the
      capabilities of the 'ping' program to be "= cap_net_raw+i", and then
      only shells that have (pI & CAP_NET_RAW) will be able to run
      it. Without CAP_SETPCAP implemented as described above, login(pam_cap)
      would have to also have (pP & CAP_NET_RAW) in order to raise this
      capability and pass it on through the inheritable set.
      Signed-off-by: NAndrew Morgan <morgan@kernel.org>
      Signed-off-by: NSerge E. Hallyn <serue@us.ibm.com>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: James Morris <jmorris@namei.org>
      Cc: Casey Schaufler <casey@schaufler-ca.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      72c2d582
  19. 17 10月, 2007 1 次提交
  20. 13 2月, 2007 1 次提交
  21. 30 9月, 2006 1 次提交
  22. 04 7月, 2006 1 次提交
  23. 26 3月, 2006 1 次提交
  24. 12 1月, 2006 1 次提交
  25. 28 7月, 2005 1 次提交
  26. 17 4月, 2005 1 次提交
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds 提交于
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4